NOVEL PROTEINS AND NUCLEIC ACIDS ENCODING SAME 
FIELD OF THE INVENTION 

The invention relates to polynucleotides and the polypeptides encoded by such 
polynucleotides, as well as vectors, host cells, antibodies and recombinant methods for 
producing the polypeptides and polynucleotides, as well as methods for using the same. 

BACKGROUND OF THE INVENTION 

The present invention is based in part on nucleic acids encoding proteins that are new 
members of the following protein families: Zinc Finger-like proteins, Pepsin A Precursor-like 
proteins, Ribonuclease Pancreatic-like proteins, Ser/Thr Protein Kinase-like proteins, 
Glycodelin-like proteins, Neuropathy Target Esterase/Swiss Cheese Protein-like proteins, 
Acid-Sensitive Potassium Channel Protein Task-like protein, Novel Ribosomal Protein 
L8-like proteins, Prostaglandin Omega Hydroxylase-like proteins, Myeloid Upregulated 
Protein-like proteins, Testicular Serine Protease-like proteins, Hepatitis B Virus (HBV) 
Associated Factor-like proteins, Apolipoprotein L-like proteins, Rh Type C Glycoprotein-like 
proteins, Copine III-like proteins, Carboxypeptidase B Pancreatic-like proteins, Ribosomal 
Protein L29-like proteins, Ser/Thr kinase-like proteins, Metalloproteinase-Disintegrin 
(ADAM30)-like proteins, Bone Morphogenetic Protein 11-like proteins, Protein Tyrosine 
Phosphatase-like proteins, Aldo-Keto Reductase Family 7, Member A3-like proteins, Ral 
Guanine Nucleotide Exchange Factor 3-like proteins, Endolyn-like proteins, Arylacetamide 
Deacetylase-like proteins, GPCR-like proteins, PB39-like proteins, Oxytocin-like proteins, 
Thymosin beta-4-like proteins, beta Thymosin-like proteins, Thymosin Beta-4-like proteins, 
Myelin P2-like proteins, Testis Lipid-Binding Protein-like proteins, Intracellular 
Thrombospondin Domain Containing Protein-like protein, Ornithine Decarboxylase-like 
protein, Short-Chain Dehydrogenase/Reductase-like protein, Protocadherin Beta 3-like 
protein and Adrenomedullin Receptor-like protein. More particularly, the invention relates to 
nucleic acids encoding novel polypeptides, as well as vectors, host cells, antibodies, and 
recombinant methods for producing these nucleic acids and polypeptides. 

SUMMARY OF THE INVENTION 

The invention is based in part upon the discovery of nucleic acid sequences encoding 
novel polypeptides. The novel nucleic acids and polypeptides are referred to herein as 
NOVX, or NOV1, NOV2, NOV3, NOV4, NOV5, NOV6, NOV7, NOV8, NOV9, NOV10, 
NOV11, NOV12, NOV13, NOV14, NOV15, NOV16, NOV17, NOV18, NOV19, NOV20, 
NOV21, NOV22, NOV23, NOV24, NOV25, NOV26, NOV27, NOV28, NOV29, NOV30, 
NOV31, NOV32, NOV33, NOV34, NOV35, NOV36, and NOV37 nucleic acids and 
polypeptides. These nucleic acids and polypeptides, as well as derivatives, homologs, 
analogs and fragments thereof, will hereinafter be collectively designated as "NOVX" nucleic 
acid or polypeptide sequences. 

In one aspect, the invention provides an isolated NOVX nucleic acid molecule 
encoding a NOVX polypeptide that includes a nucleic acid sequence that has identity to the 
nucleic acids disclosed in SEQ ID NOS:1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 
33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77, 79, 81, 
83, 85, 87, 89, 91, 93, 95, 97, 99, 101, 103, 105, 107, 109, and 111. In some embodiments, 
the NOVX nucleic acid molecule will hybridize under stringent conditions to a nucleic acid 
sequence complementary to a nucleic acid molecule that includes a protein-coding sequence 
of a NOVX nucleic acid sequence. The invention also includes an isolated nucleic acid that 
encodes a NOVX polypeptide, or a fragment, homolog, analog or derivative thereof. For 
example, the nucleic acid can encode a polypeptide at least 80% identical to a polypeptide 
comprising the amino acid sequences of SEQ ID NOS:2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 
24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 
74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100, 102, 104, 106, 108, 110, and 112. The 
nucleic acid can be, for example, a genomic DNA fragment or a cDNA molecule that 
includes the nucleic acid sequence of any of SEQ ID NOS:1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 
23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 
73, 75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 99, 101, 103, 105, 107, 109, and 111. 

Also included in the invention is an oligonucleotide, e.g., an oligonucleotide which 
includes at least 6 contiguous nucleotides of a NOVX nucleic acid (e.g., SEQ ID NOS:1, 3, 5, 
7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 
57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 99, 101, 103, 
105, 107, 109, and 111) or a complement of said oligonucleotide. Also included in the 
invention are substantially purified NOVX polypeptides (SEQ ID NOS:2, 4, 6, 8, 10, 12, 14, 
16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 
66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100, 102, 104, 106, 108, 110, 
and 112). In certain embodiments, the NOVX polypeptides include an amino acid sequence 
that is substantially identical to the amino acid sequence of a human NOVX polypeptide. 

The invention also features antibodies that immunoselectively bind to NOVX 
polypeptides, or fragments, homologs, analogs or derivatives thereof. 

In another aspect, the invention includes pharmaceutical compositions that include 
therapeutically- or prophylactically-effective amounts of a therapeutic and a 
pharmaceutically-acceptable carrier. The therapeutic can be, e.g., a NOVX nucleic acid, a 
NOVX polypeptide, or an antibody specific for a NOVX polypeptide. In a further aspect, the 
invention includes, in one or more containers, a therapeutically- or prophylactically-effective 
amount of this pharmaceutical composition. 

In a further aspect, the invention includes a method of producing a polypeptide by 
culturing a cell that includes a NOVX nucleic acid, under conditions allowing for expression 
of the NOVX polypeptide encoded by the DNA. If desired, the NOVX polypeptide can then 
be recovered. 



In another aspect, the invention includes a method of detecting the presence of a 
NOVX polypeptide in a sample. In the method, a sample is contacted with a compound that 
selectively binds to the polypeptide under conditions allowing for formation of a complex 
between the polypeptide and the compound. The complex is detected, if present, thereby 
identifying the NOVX polypeptide within the sample. 

The invention also includes methods to identify specific cell or tissue types based on 
their expression of a NOVX. 

Also included in the invention is a method of detecting the presence of a NOVX 
nucleic acid molecule in a sample by contacting the sample with a NOVX nucleic acid probe 
or primer, and detecting whether the nucleic acid probe or primer bound to a NOVX nucleic 
acid molecule in the sample. 

In a further aspect, the invention provides a method for modulating the activity of a 
NOVX polypeptide by contacting a cell sample that includes the NOVX polypeptide with a 
compound that binds to the NOVX polypeptide in an amount sufficient to modulate the 
activity of said polypeptide. The compound can be, e.g., a small molecule, such as a nucleic 
acid, peptide, polypeptide, peptidomimetic, carbohydrate, lipid or other organic (carbon 
containing) or inorganic molecule, as further described herein. 

Also within the scope of the invention is the use of a therapeutic in the manufacture of 
a medicament for treating or preventing disorders or syndromes including, e.g., trauma, 
regeneration (in vitro and in vivo); Von Hippel-Lindau (VHL) syndrome; Alzheimer's 
disease; stroke; Tuberous sclerosis; hypercalcemia; Parkinson's disease, Huntington's 
disease; Cerebral palsy; Epilepsy; Lesch-Nyhan syndrome; multiple sclerosis; Ataxia- 
telangiectasia; leukodystrophies; behavioral disorders; addiction, anxiety, pain; actinic 
keratosis; acne; hair growth diseases; alopecia; pigmentation disorders; endocrine disorders; 
connective tissue disorders (such as severe neonatal Marfan syndrome dominant ectopia 
lentis, familial ascending aortic aneurysm and isolated skeletal features of Marfan syndrome); 
Shprintzen-Goldberg syndrome; genodermatoses; contractural arachnodactyly; inflammatory 
disorders such as osteo- and rheumatoid-arthritis; inflammatory bowel disease; Crohn's 
disease; immunological disorders; AIDS; cancers including but not limited to lung cancer, 
colon cancer, neoplasm, adenocarcinoma, lymphoma, prostate cancer, uterus cancer, 
leukemia or pancreatic cancer; blood disorders; asthma; psoriasis; vascular disorders, 
hypertension, skin disorders, renal disorders including Alport syndrome; immunological 
disorders; tissue injury; fibrosis disorders; bone diseases; Ehlers-Danlos syndrome type VI, 
VII, type IV, X-linked cutis laxa and Ehlers-Danlos syndrome type V; osteogenesis 
imperfecta; neurologic diseases; brain disorders like encephalomyelitis; neurodegenerative 
disorders; immune disorders; hematopoietic disorders; muscle disorders; inflammation and 
wound repair; parasitic, bacterial, fungal, protozoal and viral infections (particularly 
infections caused by HIV-1 or HIV-2), acute heart failure; hypotension; hypertension; urinary 
retention; osteoporosis; treatment of Albright hereditary osteodystrophy; angina pectoris; 
myocardial infarction; ulcers; benign prostatic hypertrophy; arthrogryposis multiplex 
congenita; osteogenesis imperfecta; keratoconus; scoliosis; duodenal atresia; esophageal 
atresia; intestinal malrotation; pancreatitis; obesity; systemic lupus erythematosus; 
autoimmune disease; emphysema; scleroderma; allergy; ARDS; neuroprotection; fertility; 
Myasthenia gravis; diabetes; growth and reproductive disorders; hemophilia; 
hypercoagulation; idiopathic thrombocytopenic purpura; immunodeficiencies; graft versus 
host; adrenoleukodystrophy; congenital adrenal hyperplasia; endometriosis; xerostomia; 
ulcers; cirrhosis; transplantation; diverticular disease; Hirschsprung's disease; appendicitis; 
arthritis; ankylosing spondylitis; tendinitis; renal artery stenosis; interstitial nephritis; 
glomerulonephritis; polycystic kidney disease; erythematosus; renal tubular acidosis; IgA 
nephropathy; anorexia; bulimia; psychotic disorders, including schizophrenia, manic 
depression, delirium, and dementia; severe mental retardation and dyskinesias, and/or other 
pathologies and disorders of the like. 

The therapeutic can be, e.g., a NOVX nucleic acid, a NOVX polypeptide, or a 
NOVX-specific antibody, or biologically-active derivatives or fragments thereof. 

For example, the compositions of the present invention will have efficacy for 
treatment of patients suffering from the diseases and disorders disclosed above and/or other 
pathologies and disorders of the like. The polypeptides can be used as immunogens to 
produce antibodies specific for the invention, and as vaccines. They can also be used to 
screen for potential agonist and antagonist compounds. For example, a cDNA encoding 
NOVX may be useful in gene therapy, and NOVX may be useful when administered to a 
subject in need thereof. By way of non-limiting example, the compositions of the present 
invention will have efficacy for treatment of patients suffering from the diseases and 
disorders disclosed above and/or other pathologies and disorders of the like. 

The invention further includes a method for screening for a modulator of disorders or 
syndromes including, e.g., the diseases and disorders disclosed above and/or other 
pathologies and disorders of the like. The method includes contacting a test compound with a 
NOVX polypeptide and determining if the test compound binds to said NOVX polypeptide. 
Binding of the test compound to the NOVX polypeptide indicates the test compound is a 
modulator of activity, or of latency or predisposition to the aforementioned disorders or 
syndromes. 

Also within the scope of the invention is a method for screening for a modulator of 
activity, or of latency or predisposition to disorders or syndromes including, e.g., the diseases 
and disorders disclosed above and/or other pathologies and disorders of the like by 
administering a test compound to a test animal at increased risk for the aforementioned 
disorders or syndromes. The test animal expresses a recombinant polypeptide encoded by a 
NOVX nucleic acid. Expression or activity of NOVX polypeptide is then measured in the 
test animal, as is expression or activity of the protein in a control animal which 
recombinantly-expresses NOVX polypeptide and is not at increased risk for the disorder or 
syndrome. Next, the expression of NOVX polypeptide in both the test animal and the control 
animal is compared. A change in the activity of NOVX polypeptide in the test animal 
relative to the control animal indicates the test compound is a modulator of latency of the 
disorder or syndrome. 

In yet another aspect, the invention includes a method for determining the presence of 
or predisposition to a disease associated with altered levels of a NOVX polypeptide, a NOVX 
nucleic acid, or both, in a subject (e.g., a human subject). The method includes measuring the 
amount of the NOVX polypeptide in a test sample from the subject and comparing the 
amount of the polypeptide in the test sample to the amount of the NOVX polypeptide present 
in a control sample. An alteration in the level of the NOVX polypeptide in the test sample as 
compared to the control sample indicates the presence of or predisposition to a disease in the 
subject. Preferably, the predisposition includes, e.g., the diseases and disorders disclosed 
above and/or other pathologies and disorders of the like. Also, the expression levels of the 
new polypeptides of the invention can be used in a method to screen for various cancers as 
well as to determine the stage of cancers. 

In a further aspect, the invention includes a method of treating or preventing a 
pathological condition associated with a disorder in a mammal by administering to the 
subject a NOVX polypeptide, a NOVX nucleic acid, or a NOVX-specific antibody to a 
subject (e.g., a human subject), in an amount sufficient to alleviate or prevent the pathological 
condition. In preferred embodiments, the disorder includes, e.g., the diseases and disorders 
disclosed above and/or other pathologies and disorders of the like. 

In yet another aspect, the invention can be used in a method to identify the cellular 
receptors and downstream effectors of the invention by any one of a number of techniques 
commonly employed in the art. These include but are not limited to the two-hybrid system, 
affinity purification, and co-precipitation with antibodies or other specific-interacting molecules. 

NOVX nucleic acids and polypeptides are further useful in the generation of 
antibodies that bind immuno-specifically to the novel NOVX substances for use in 
therapeutic or diagnostic methods. These NOVX antibodies may be generated according to 
methods known in the art, using prediction from hydrophobicity charts, as described in the 
"Anti-NOVX Antibodies" section below. The disclosed NOVX proteins have multiple 
hydrophilic regions, each of which can be used as an immunogen. These NOVX proteins can 
be used in assay systems for functional analysis of various human disorders, which will help 
in understanding of pathology of the disease and development of new drug targets for various 
disorders. 

The NOVX nucleic acids and proteins identified here may be useful in potential 
therapeutic applications implicated in (but not limited to) various pathologies and disorders as 
indicated below. The potential therapeutic applications for this invention include, but are not 
limited to: protein therapeutic, small molecule drug target, antibody target (therapeutic, 
diagnostic, drug targeting/cytotoxic antibody), diagnostic and/or prognostic marker, gene 
therapy (gene delivery/gene ablation), research tools, tissue regeneration in vivo and in vitro 
of all tissues and cell types composing (but not limited to) those defined here. 

Unless otherwise defined, all technical and scientific terms used herein have the same 
meaning as commonly understood by one of ordinary skill in the art to which this invention 
belongs. Although methods and materials similar or equivalent to those described herein can 
be used in the practice or testing of the present invention, suitable methods and materials are 
described below. All publications, patent applications, patents, and other references 
mentioned herein are incorporated by reference in their entirety. In the case of conflict, the 
present specification, including definitions, will control. In addition, the materials, methods, 
and examples are illustrative only and not intended to be limiting. 

Other features and advantages of the invention will be apparent from the following 
detailed description and claims. 

BRIEF DESCRIPTION OF THE DRAWINGS 

FIG. 1 depicts an electrophoresis profile for angiopoietin related protein (ARP), panel 
A and vascular endothelial growth factor (VEGF), panel B; and a TaqMan expression profile 
for VEGF (panel C) and for ARP (panel D). 



DETAILED DESCRIPTION OF THE INVENTION 

The present invention provides novel nucleotides and polypeptides encoded thereby. 
Included in the invention are the novel nucleic acid sequences and their encoded 
polypeptides. The sequences are collectively referred to herein as "NOVX nucleic acids" or 
"NOVX polynucleotides" and the corresponding encoded polypeptides are referred to as 
"NOVX polypeptides" or "NOVX proteins." Unless indicated otherwise, "NOVX" is meant 
to refer to any of the novel sequences disclosed herein. Table A provides a summary of the 
NOVX nucleic acids and their encoded polypeptides. 



TABLE A. Sequences and Corresponding SEQ ID Numbers 

NOVX No. | Internal Acc. No. | Protein Family | Nucleic Acid SEQ ID NO. | Amino Acid SEQ ID NO. 
1 | CG56920-01 | Zinc Finger Protein-like Proteins | 1 | 2 
2 | CG57107-01 | Pepsin A Precursor-like Protein | 3, 5, 7, 9, 11 | 4, 6, 8, 10, 12 
3 | CG56936-01 | Ribonuclease Pancreatic-like Proteins | 13 | 14 
4 | CG51707-02 | Ser/Thr Protein Kinase-like Proteins | 15 | 16 
5 | CG57081-01 | Ser/Thr Protein Kinase-like Proteins | 17 | 18 
6 | CG56684-02 | Glycodelin-like Proteins | 19 | 20 
7 | CG56977-01 | Neuropathy Target Esterase/Swiss Cheese Protein-like Proteins | 21 | 22 
8 | CG57119-01 | Acid-Sensitive Potassium Channel Protein Task-like Proteins | 23 | 24 
9 | CG57143-01 | Novel Ribosomal Protein L8-like Proteins | 25 | 26 
10 | CG56860-01 | Prostaglandin Omega Hydroxylase-like Proteins | 27 | 28 
11 | CG57024-01 | Myeloid Upregulated Protein-like Proteins | 29 | 30 
12 | CG57083-01 | Testicular Serine Protease-like Proteins | 31 | 32 
13a | CG56961-01 | Hepatitis B Virus (HBV) Associated Factor-like Proteins | 33 | 34 
13b | CG56961-02 | Hepatitis B Virus (HBV) Associated Factor-like Proteins | 35 | 36 
14 | CG57104-01 | Apolipoprotein L-like Proteins | 37 | 38 
14b | CG57104-02 | Apolipoprotein L-like Proteins | 39 | 40 
15 | CG57146-01 | Rh Type C Glycoprotein-like Protein | 41 | 42 
16 | CG57169-01 | Copine III-like Protein | 43 | 44 
17 | CG57177-01 | Carboxypeptidase B, Pancreatic-like Proteins | 45, 47, 49, 51, 53 | 46, 48, 50, 52, 54 
18a | CG57113-01 | Ribosomal Protein L29-like Proteins | 55 | 56 
18b | CG57113-02 | Ribosomal Protein L29-like Proteins | 57 | 58 
19 | CG57211-01 | Metalloproteinase-Disintegrin (ADAM30)-like Proteins | 59 | 60 
20 | CG57222-01 | Bone Morphogenetic Protein 11-like Proteins | 61 | 62 
21a | CG56477-01 | Adrenomedullin Receptor-like Protein | 63 | 64 
21b | CG56477-02 | Adrenomedullin Receptor-like Protein | 65 | 66 
21c | CG56477-03 | Adrenomedullin Receptor-like Protein | 67 | 68 
22a | CG57256-01 | Protein Tyrosine Phosphatase-like Proteins | 69 | 70 
22b | CG57256-02 | Protein Tyrosine Phosphatase-like Proteins | 71 | 72 
23 | CG57228-01 | Aldo-Keto Reductase Family 7, Member A3-like | 73 | 74 
24 | CG57274-01 | Ral Guanine Nucleotide Exchange Factor 3-like Proteins | 75 | 76 
25 | CG57276-01 | Endolyn-like Proteins | 77 | 78 
26 | CG57224-01 | Arylacetamide Deacetylase-like Proteins | 79 | 80 
27 | CG57288-01 | GPCR-like Proteins | 81 | 82 
28 | CG57213-01 | PB39-like Proteins | 83 | 84 
29 | CG56990-02 | Oxytocin-like Proteins | 85 | 86 
30a | CG57330-01 | Thymosin beta-4-like Proteins | 87 | 88 
30b | CG57330-03 | Beta Thymosin-like Proteins | 89 | 90 
30c | CG57330-02 | Thymosin Beta-4-like Proteins | 91 | 92 
31 | CG57344-01 | Myelin P2-like Proteins | 93 | 94 
32a | CG57346-01 | Testis Lipid-binding Protein-like Proteins | 95 | 96 
32b | CG57346-02 | Testis Lipid-binding Protein-like Proteins | 97 | 98 
33 | CG57356-01 | Intracellular Thrombospondin Domain Containing Protein-like Protein | 99 | 100 
34a | CG57258-01 | Ornithine Decarboxylase-like Protein | 101 | 102 
34b | CG57258-02 | Ornithine Decarboxylase-like Protein | 103 | 104 
34c | CG57258-03 | Ornithine Decarboxylase-like Protein | 105 | 106 
35 | CG57339-01 | Short-chain Dehydrogenase/Reductase-like Protein | 107 | 108 
36 | CG57341-01 | Short-chain Dehydrogenase/Reductase-like Protein | 109 | 110 
37 | CG57335-01 | Protocadherin Beta 3-like Protein | 111 | 112 

NOVX nucleic acids and their encoded polypeptides are useful in a variety of 
applications and contexts. The various NOVX nucleic acids and polypeptides according to 
the invention are useful as novel members of the protein families according to the presence of 
domains and sequence relatedness to previously described proteins. Additionally, NOVX 
nucleic acids and polypeptides can also be used to identify proteins that are members of the 
family to which the NOVX polypeptides belong. 

NOV1 is homologous to the Fibromodulin family of proteins. Thus, the NOV1 
nucleic acids, polypeptides, antibodies and related compounds according to the invention will 
be useful in therapeutic and diagnostic applications implicated in, for example, the treatment 
of patients suffering from: repair of damage to cartilage and ligaments; therapeutic 
applications to joint repair, and other diseases, disorders and conditions of the like. 

It has been suggested that fibromodulin participates in the assembly of the 
extracellular matrix by virtue of its ability to interact with type I and type II collagen fibrils 
and to inhibit fibrillogenesis in vitro. 

Additional utilities for the NOVX nucleic acids and polypeptides according to the 
invention are disclosed herein. 



NOV1 

A disclosed NOV1a (designated CuraGen Acc. No. CG56290-01) encodes a novel 
Zinc Finger Protein-like protein and includes the 1319 nucleotide sequence (SEQ ID NO:1) 
shown in Table 1A. An open reading frame for the mature protein was identified beginning 
with an ATG initiation codon at nucleotides 445-447 and ending with a TAA stop codon at 
nucleotides 1228-1230. Putative untranslated regions are underlined in Table 1A, and the 
start and stop codons are in bold letters. 
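
As a quick consistency check on the coordinates given above, the following minimal Python sketch computes the length of the open reading frame and the number of residues it encodes. The function name is hypothetical and chosen only for illustration; the coordinates (445 and 1230) are the 1-based positions stated in the text.

```python
def orf_summary(start_codon_first_nt, stop_codon_last_nt):
    """Summarize an ORF given 1-based, inclusive nucleotide coordinates.

    Returns (length in nucleotides, number of codons including the stop
    codon, number of encoded amino acid residues).
    """
    length_nt = stop_codon_last_nt - start_codon_first_nt + 1
    if length_nt % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    codons = length_nt // 3
    # The stop codon terminates translation and encodes no residue.
    return length_nt, codons, codons - 1


# ATG at nucleotides 445-447, TAA stop at nucleotides 1228-1230.
print(orf_summary(445, 1230))  # (786, 262, 261)
```

The result, 261 residues, agrees with the length reported for the NOV1 polypeptide.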



Table 1A. NOV1 Nucleotide Sequence (SEQ ID NO:1) 

ACAGCCACAGTGATTTCATCCTTCQATACAGGGGATATACTGTACAGTCCTTTTTCTAGAAGTGAGACATACA?VGA 

TTACTCTACAAGAGGAAGATTCCAGGGGCTCAAAAACGCTUVAGGTTTGCACTTTGAGAGCCCC^ 

AACTCAGGATCTAAAACAAAGTTCTGTGTTAATGAGTTACAGAATTCACGTGGAAGTCAATGTCACTTTATAATCG 

ATAATAATACTGAGTGAGGAACACTATGCAGGAAGAAACCTTCCGTAGAAAGACAGGC^GGGAAAAGCTTAGGCTG 

ACCTTAAACTTACCTAATAGAGCAAGCCTGAGATAGACTGCCAAAATGGCCAAATAAGAGACTCTATGAAATAACA 

GTCTTGTAACTGTAGTAATCATAAGGAAATTTTCTCCTTGAAATCACGATACCAAATAGGAAAAA TGATCTACAAG 

TGCCCCATGTGTAGGGAATTTTTCTCTGAGAGAGCAGATCTTTTTATGCATCAGAAAATTCACACAGCTGAGAAGC 

CCCATAAATGTGACAAGTGTGATAAGGGTTTCTTTCATATATCAGAACTTCATATTCATTGGAGAGACCATACAGG 

AGAGAAGGTCTATAAATGTGATGATTGTGGTAAGGATTTTAGCACTACAACAAAACTTAATAGACATAAGAAAATC 

CACACAGTGGAGAAGCCCTATAAATGTTACGAGTGTGGCAAAGCCTTCAATTGGAGCTCCCATCTTCAAATTCATA 

TGAGAGTTCATACAGGTGAGAAACCGTATGTCTGTAGTGAGTGTGGAAGGGGCTTTAGTAATAGTTCAAACCTTTG 

CATGCATCAGAGAGTCCACACCGGAGAGAAGCCCTTTAAATGTGAAGAGTGTGGGAAGGCCTTCAGGCACACCTCC 

AGCCTCTGCATGCATCAAAGAGTCCACACAGGAGAGAAACCCTATAAATGTTATGAGTGTGGGAAGGCGTTCAGTC 

AGAGTTCGAGCCTCTGCATCCACCAGAGAGTCCACACTGGAGAGAAACCCTATAGATGTTGTGGATGTGGGAAGGC 

CTTCAGTCAGAGTTCGGGCCTGTGCATCCACCAGAGAGTCCACACAGGAGAGAAACCTTTCAAATGTGATGAGTGC 

GGAAAGGCCTTCAGTCAGAGTACGAGCCTCTGCATCCACCAGAGAGTCCACACAAAGGAGAGAAACCATCYCAAAA 
TATCAGTTATATA AAACGTTTTGCTAAGAGTTTAAAATCITAAAACCCATAAGTGCCACTAGGAAGGaAACCC^ 

ATCGAAGGATGAAATCACTGTGGCTGT 

For all BLAST data described herein, public nucleotide databases include all 
GenBank databases and the GeneSeq patent database; and public amino acid databases 
include the GenBank databases, SwissProt, PDB and PIR. 

The disclosed NOV1 nucleic acid sequence maps to chromosome 12q24.3 and has 901 of 
1057 bases (85%) identical to a gb:GENBANK-ID:GPIZFPA|acc:L26335.1 mRNA from Cavia 
porcellus (Cavia porcellus zinc finger protein (zfOC1) mRNA, complete cds) (E = 1.2e-^^). 

In all BLAST alignments herein, the "E-value" or "Expect" value is a numeric 
indication of the probability that the aligned sequences could have achieved their similarity to 
the BLAST query sequence by chance alone, within the database that was searched. For 
example, the probability that the subject ("Sbjct") retrieved from the NOV1 BLAST analysis, 
e.g., Cavia porcellus zinc finger protein mRNA, matched the Query NOV1 sequence purely 
by chance is 1.2x10-^^. The Expect value (E) is a parameter that describes the number of hits 
one can "expect" to see just by chance when searching a database of a particular size. It 
decreases exponentially with the Score (S) that is assigned to a match between two 
sequences. Essentially, the E value describes the random background noise that exists for 
matches between sequences. 
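
The exponential relationship between score and Expect value described above can be illustrated with the standard Karlin-Altschul form E = m * n * 2^(-S'), where S' is the normalized (bit) score, m is the query length and n is the database length. The formula, the function names and the example numbers below are assumptions introduced for illustration and are not taken from the present disclosure; actual BLAST implementations also apply effective-length corrections that this sketch omits.

```python
import math


def expect_value(bit_score, query_len, db_len):
    """Expected number of chance hits scoring at least `bit_score`.

    Uses E = m * n * 2**(-S'), ignoring edge-effect corrections.
    """
    return query_len * db_len * 2.0 ** (-bit_score)


def p_value(e_value):
    """Probability of at least one chance hit: P = 1 - exp(-E)."""
    return -math.expm1(-e_value)


# Hypothetical search: a 261-residue query against 1e8 database residues.
for score in (30, 40, 50):
    e = expect_value(score, 261, 1e8)
    print(score, e, p_value(e))
```

Each additional bit of score halves E, and for small E the P value and the E value are nearly equal, which is why E can stand in for P when reporting significance.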

The Expect value is used as a convenient way to create a significance threshold for 
reporting results. The default value used for blasting is typically set to 0.0001. In BLAST 
2.0, the Expect value is also used instead of the P value (probability) to report the 
significance of matches. For example, an E value of one assigned to a hit can be interpreted 
as meaning that in a database of the current size one might expect to see one match with a 
similar score simply by chance. An E value of zero means that one would not expect to see 
any matches with a similar score simply by chance. See, e.g., 
http://www.ncbi.nlm.nih.gov/Education/BLASTinfo/. Occasionally, a string of X's or N's will 
result from a BLAST search. This is a result of automatic filtering of the query for 
low-complexity sequence that is performed to prevent artifactual hits. The filter substitutes 
any low-complexity sequence that it finds with the letter "N" in nucleotide sequences (e.g., 
"NNNNNNNNNNNNN") or the letter "X" in protein sequences (e.g., "XXXXXXXXX"). 
Low-complexity regions can result in high scores that reflect compositional bias rather than 
significant position-by-position alignment. Wootton and Federhen, Methods Enzymol 
266:554-571, 1996. Other BLAST 
results include sequences from the Patp database, which is a proprietary database that 
contains sequences published in patents and patent publications. 
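
The masking behavior described above can be sketched as follows. This is a deliberately simplified stand-in based on Shannon entropy in a sliding window; it is not the filter actually used by BLAST, and the window size, threshold and function names are assumptions chosen only for illustration.

```python
import math
from collections import Counter


def shannon_entropy(window):
    """Shannon entropy (bits per symbol) of the characters in `window`."""
    counts = Counter(window)
    total = len(window)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def mask_low_complexity(seq, window=12, threshold=1.5, mask_char="X"):
    """Replace every window whose entropy is below `threshold` with `mask_char`.

    Use mask_char="N" for nucleotide sequences and "X" for protein sequences.
    """
    masked = list(seq)
    for i in range(len(seq) - window + 1):
        if shannon_entropy(seq[i:i + window]) < threshold:
            for j in range(i, i + window):
                masked[j] = mask_char
    return "".join(masked)


print(mask_low_complexity("A" * 15, mask_char="N"))   # homopolymer run is masked
print(mask_low_complexity("ACDEFGHIKLMNPQRSTVWY"))    # diverse sequence is unchanged
```

Because whole windows are masked, a few residues flanking a low-complexity run may also be replaced; a production filter would trim such boundaries more carefully.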

A disclosed NOV1 polypeptide (SEQ ID NO:2) is 261 amino acid residues in length 
and is presented using the one-letter amino acid code in Table 1B. The SignalP, Psort and/or 
Hydropathy results predict that NOV1 does not have a signal peptide and is likely to be 
localized to the mitochondrial matrix space with a certainty of 0.4401. In alternative 
embodiments, a NOV1 polypeptide is located to the microbody (peroxisome) with a certainty 
of 0.4294, the nucleus with a certainty of 0.3000, or in the mitochondrial inner membrane 
with a certainty of 0.1252. 

Table 1B. Encoded NOV1 Protein Sequence (SEQ ID NO:2) 

MIYKCPMCREFFSERADLF^mQKIHTAEKPHKCDKCX^KGFFHISELHIHWRDHTGEKVYKCDDCGKDFSTTTKI^ 
RHKKIHTVEKPYKCYECGKAFNWSSHLQIHMRVHTGEKPYVCSECGRGFSNSSNLCMHQRVHTGEKPFKCEECGK 
AFRHTSSLCMHQRVHTGEKPyKCYECGKAFSQSSSLCIHQRVHTGEKPYRCCGCGKAFSQSSGLCIHQRVHTGEK 
PFKCDECGKAFSQSTSLCIHQRVHTKERNHIiKISVI 

The NOV1 amino acid sequence was found to have 258 of 261 amino acid residues 
(98%) identical to, and 259 of 261 amino acid residues (99%) similar to, the 261 amino acid 
residue ptnr:SPTREMBL-ACC:Q60493 protein from Cavia porcellus (Guinea pig) (ZINC 
FINGER PROTEIN) (E = 1.9e-^^). 
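
The identity and similarity percentages quoted in this section can be reproduced from the match counts. The sketch below assumes that percentages are reported as whole numbers obtained by truncation, which is consistent with the figures in the text; the function name is hypothetical.

```python
def percent(matches, aligned_length):
    """Whole-number percentage of matching positions, truncated toward zero."""
    return (100 * matches) // aligned_length


print(percent(258, 261))   # 98  (identical residues)
print(percent(259, 261))   # 99  (similar residues)
print(percent(901, 1057))  # 85  (identical bases in the nucleotide alignment)
```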

The Zinc Finger Protein-like gene disclosed in this invention is expressed in at least 
the following tissues: retina, and organ of Corti. Expression information was derived from 
the tissue sources of the sequences that were included in the derivation of the sequence of 
NOV1. 

Possible single nucleotide polymorphisms (SNPs) found for NOV1 are listed in Tables 
1C and 1D, where "PAF" is putative allelic frequency, the sign ">" means "is changed to", 
"N/A" refers to a silent mutation, and "Depth" represents the number of clones covering the 
region of the SNP. 





Table 1C: SNPs 

Consensus Position | Depth | Base Change | PAF 
1084 | 7 | G>A | N/A 



Table 1D: SNPs 

Variant ID | Nucleotide Position | Base Change | Amino Acid Position | Base Change 
13376980 | 69 | A>G | NA | NA 
13376981 | 1081 | G>T | 213 | Gly>Ser 
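
The relationship between the nucleotide positions and amino acid positions in the SNP tables above can be sketched as follows, assuming the open reading frame coordinates given earlier for NOV1 (nucleotides 445 through 1230, 1-based and inclusive). The function name and default arguments are hypothetical and used only for illustration.

```python
def codon_of(nt_pos, orf_start=445, orf_end=1230):
    """Map a 1-based nucleotide position to (codon number, position in codon).

    Returns None when the position lies outside the open reading frame,
    i.e. in an untranslated region.
    """
    if not orf_start <= nt_pos <= orf_end:
        return None
    offset = nt_pos - orf_start
    return offset // 3 + 1, offset % 3 + 1


print(codon_of(1081))  # (213, 1): first base of codon 213
print(codon_of(69))    # None: upstream of the start codon
```

Position 1081 falls in codon 213, matching the amino acid position listed for that variant, while position 69 lies in the 5' untranslated region and therefore has no amino acid position.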



Homologies to any of the above NOV1 proteins will be shared by other NOV1 
proteins insofar as they are homologous to each other as shown above. Any reference to 
NOV1 is assumed to refer to both of the NOV1 proteins in general, unless otherwise noted. 

NOV1 also has homology to the amino acid sequences shown in the BLASTP data 
listed in Table 1E. 



Table 1E. BLAST results for NOV1 

Gene Index/Identifier | Protein/Organism | Length (aa) | Identity (%) | Positives (%) | Expect 
gi|2144127|pir||S70006 | finger protein zfOC1 - guinea pig | 261 | 258/261 (98%) | 259/261 (99%) | e-123 
gi|1196461|gb|AAC41997.1| (L41669) | ZFOC1 gene product [Homo sapiens] | 184 | 181/184 (98%) | 183/184 (99%) | 6e-84 
gi|2135119|pir||S70007 | finger protein zfOC1 - human (fragment) | 183 | 180/183 (98%) | 182/183 (99%) | 2e-83 
gi|17445052|ref|XP_060551.1| (XM_060551) | similar to zinc finger protein 85 (HPF4, HTF1) [Homo sapiens] | 1147 | 151/253 (59%) | 187/253 (73%) | 1e-78 
gi|7019581|ref|NP_037381.1| (NM_013249) | zinc finger protein 214 [Homo sapiens] | 606 | 155/246 (63%) | 184/246 (74%) | 1e-76 


The homology of these sequences is shown graphically in the ClustalW analysis 
shown in Table 1F. 



Table 1F. ClustalW Analysis of NOV1

1) NOV1 (SEQ ID NO:2)
2) gi|2144127 (SEQ ID NO:113)
3) gi|1196461 (SEQ ID NO:114)
4) gi|2135119 (SEQ ID NO:115)
5) gi|17445052 (SEQ ID NO:116)
6) gi|7019581 (SEQ ID NO:117)

[ClustalW multiple sequence alignment of sequences 1) through 6)]



Tables 1G and 1H list the domain description from DOMAIN analysis results against
NOV1. This indicates that the NOV1 sequence has properties similar to those of other
proteins known to contain these domains. The presence of identifiable domains in NOV1, as
well as all other NOVX proteins, was determined by searches using software algorithms such
as PROSITE, DOMAIN, Blocks, Pfam, ProDomain, and Prints, and then determining the
Interpro number by crossing the domain match (or numbers) using the Interpro website
(http://www.ebi.ac.uk/interpro). DOMAIN results may be collected from the Conserved
Domain Database (CDD) with Reverse Position Specific BLAST analyses. This BLAST
analysis software samples domains found in the Smart and Pfam collections. Sequences may
also be analyzed according to a hmmpfam search against the HMM database (HMMER 2.1.1
(Dec 1998), Copyright (C) 1992-1998 Washington University School of Medicine).
HMMER is freely distributed under the GNU General Public License.

For Table 1G and all successive DOMAIN sequence alignments, aligned residues are
displayed in uppercase, residues identical (conserved) in the alignment between query
(NOVX) and representative are shown in the extra line (|) between the two sequences, similar
residues ("strong," semi-conserved, with a positive score in the BLOSUM62 matrix) are
indicated with a (+). Regions masked out due to composition-bias are displayed in italics.
The "strong" group of conserved amino acid residues may be any one of the following groups
of amino acids: STA, NEQK, NHQK, NDEQ, QHRK, MILV, MILF, HY, FYW.
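
As an illustration only, the marking convention described above can be sketched in Python. The function names are hypothetical, and the sketch approximates "similar" residues by membership in one of the listed "strong" groups instead of looking up scores in the BLOSUM62 matrix, so it is a simplification and not the DOMAIN or HMMER implementation.

```python
# Hypothetical sketch: mark identical and "strong"-group residue pairs
# in a pairwise alignment. Group list taken from the text above.
STRONG_GROUPS = ("STA", "NEQK", "NHQK", "NDEQ", "QHRK",
                 "MILV", "MILF", "HY", "FYW")


def classify_pair(a, b):
    """Return '|' for identical residues, '+' for residues that share a
    "strong" group, and ' ' otherwise (including gap positions)."""
    a, b = a.upper(), b.upper()
    if a == "-" or b == "-":
        return " "
    if a == b:
        return "|"
    if any(a in group and b in group for group in STRONG_GROUPS):
        return "+"
    return " "


def match_line(query, subject):
    """Build the extra line shown between two aligned sequences."""
    if len(query) != len(subject):
        raise ValueError("aligned sequences must have equal length")
    return "".join(classify_pair(a, b) for a, b in zip(query, subject))


if __name__ == "__main__":
    print(repr(match_line("YKCPFD", "YKCP-M")))
```

Because group membership is only a stand-in for a positive substitution score, a real alignment program may mark some positions differently.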
Table 1G. Domain Analysis of NOV1

HMM file: pfamHMMs

Scores for sequence family classification (score includes all domains):

Model      Description                          Score    E-value    N
zf-C2H2    (InterPro) Zinc finger, C2H2 type    227.3    2.2e-64    9

Parsed for domains:

Model      Domain   seq-from   seq-to   hmm-from   hmm-to         score   E-value
zf-C2H2    1/9             3       25          1       24    []    28.5   0.00016
zf-C2H2    2/9            31       53          1       24    []    21.4   0.021
zf-C2H2    3/9            59       81          1       24    []    32.4   1e-05
zf-C2H2    4/9            87      109          1       24    []    35.6   1.1e-06
zf-C2H2    5/9           115      137          1       24    []    35.4   1.3e-06
zf-C2H2    6/9           143      165          1       24    []    32.8   8e-06
zf-C2H2    7/9           171      193          1       24    []    34.1   3.3e-06
zf-C2H2    8/9           199      221          1       24    []    32.3   1.1e-05
zf-C2H2    9/9           227      249          1       24    []    34.1   3.2e-06



For example, Table 1H depicts the alignment of several regions of NOV1 with the
zinc finger C2H2 consensus pattern YKCPFDCGKSFSRKSNLKRHLRTH (SEQ ID
NO:118).



Table 1H. Alignments of top-scoring domains for NOV1

zf-C2H2: domain 1 of 9, from 3 to 25: score 28.5, E = 0.00016
                *->ykCpfdCgksFsrksnLkrHlrtH<-*
   NOV1      3     YKCP-MCREFFSERADLFMHQKIH      25   (SEQ ID NO:119)

zf-C2H2: domain 2 of 9, from 31 to 53: score 21.4, E = 0.021
                *->ykCpfdCgksFsrksnLkrHlrtH<-*
   NOV1     31     HKCD-KCDKGFFHISELHIHWRDH      53   (SEQ ID NO:120)

zf-C2H2: domain 3 of 9, from 59 to 81: score 32.4, E = 1e-05
                *->ykCpfdCgksFsrksnLkrHlrtH<-*
   NOV1     59     YKCD-DCGKDPSTTTKI*NRHiaCIH    81   (SEQ ID NO:121)

zf-C2H2: domain 4 of 9, from 87 to 109: score 35.6, E = 1.1e-06
                *->ykCpfdCgksFsrksnLkrHlrtH<-*
   NOV1     87     YKCY-ECGKAFNWSSHI*QIHMRVH    109   (SEQ ID NO:122)

zf-C2H2: domain 5 of 9, from 115 to 137: score 35.4, E = 1.3e-06
                *->ykCpfdCgksFsrksnLkrHlrtH<-*
   NOV1    115     YVCS-ECGRGFSNSSNLCMHQRVH     137   (SEQ ID NO:123)

zf-C2H2: domain 6 of 9, from 143 to 165: score 32.8, E = 8e-06
                *->ykCpfdCgksFsrksnLkrHlrtH<-*
   NOV1    143     FKCB-ECGKAFRHTSSIiCMHQRVH    165   (SEQ ID NO:124)

zf-C2H2: domain 7 of 9, from 171 to 193: score 34.1, E = 3.3e-06
                *->ykCpfdCgksFsrksnLkrHlrtH<-*
   NOV1    171     YKCY-ECGKAFSQSSSLCIHQRVH     193   (SEQ ID NO:125)

zf-C2H2: domain 8 of 9, from 199 to 221: score 32.3, E = 1.1e-05
                *->ykCpfdCgksFsrksnLkrHlrtH<-*
   NOV1    199     YRCC-GCGKAFSQSSGLCIHQRVH     221   (SEQ ID NO:126)

zf-C2H2: domain 9 of 9, from 227 to 249: score 34.1, E = 3.2e-06
                *->ykCpfdCgksFsrksnLkrHlrtH<-*
   NOV1    227     FKCD-ECGKAFSQSTSLCIHQRVH     249   (SEQ ID NO:127)




Zinc finger domains are nucleic acid-binding protein structures first identified in the
Xenopus transcription factor TFIIIA. These domains have since been found in numerous 
nucleic acid-binding proteins. A zinc finger domain is composed of 25 to 30 amino-acid 
residues. There are two cysteine or histidine residues at both extremities of the domain, 
which are involved in the tetrahedral coordination of a zinc atom. It has been proposed that 
such a domain interacts with about five nucleotides. 

Many classes of zinc fingers are characterized according to the number and positions 
of the histidine and cysteine residues involved in the zinc atom coordination. In the first class 
to be characterized, called C2H2, the first pair of zinc coordinating residues are cysteines, 
while the second pair are histidines. A number of experimental reports have demonstrated 
the zinc-dependent DNA or RNA binding property of some members of this class.
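
The Cys-Cys ... His-His spacing described above can be searched for with a regular expression. The sketch below is illustrative only: the function name is hypothetical, and the spacing pattern C-x(2,4)-C-x(3)-[LIVMFYWC]-x(8)-H-x(3,5)-H is a commonly quoted PROSITE-style C2H2 signature that is assumed here and does not come from this document. The test peptides are the domain 1 and domain 9 segments listed for NOV1, with the alignment gap removed.

```python
import re

# Assumed C2H2 signature (PROSITE-style); not taken from this document.
C2H2_PATTERN = re.compile(r"C.{2,4}C.{3}[LIVMFYWC].{8}H.{3,5}H")


def find_c2h2(sequence):
    """Return 1-based inclusive (start, end) spans of non-overlapping
    matches of the assumed C2H2 signature. Gap characters are dropped."""
    seq = sequence.upper().replace("-", "")
    return [(m.start() + 1, m.end()) for m in C2H2_PATTERN.finditer(seq)]


if __name__ == "__main__":
    for peptide in ("YKCP-MCREFFSERADLFMHQKIH", "FKCD-ECGKAFSQSTSLCIHQRVH"):
        print(peptide, find_c2h2(peptide))
```

A signature match of this kind is a much cruder test than the profile HMM scoring reported in the domain tables, so it should be read only as a quick sanity check on the cysteine and histidine spacing.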

A cDNA encoding a novel member of the zinc finger gene family, designated zfOC1,
has been cloned from the organ of Corti. This cDNA is the first transcriptional regulator
cloned from this sensory epithelium. This transcript encodes a peculiar protein composed of
9 zinc finger domains and a few additional amino acids. The deduced polypeptide shares
66% amino acid similarity with MOK-2, another protein composed only of zinc finger motifs and
preferentially expressed in transformed cell lines. Northern blot hybridization analysis
reveals that zfOC1 transcripts are predominantly expressed in the retina and the organ of
Corti and at lower levels in the stria vascularis, auditory nerve, tongue, cerebellum, small
intestine and kidney. Because of its relative abundance in sensorineural structures (retina and
organ of Corti), this regulatory gene should be considered a candidate for hereditary disorders
involving hearing and visual impairments that link to 12q24.3.

The protein similarity information, expression pattern, cellular localization, and map
location for the NOV1 protein and nucleic acid disclosed herein suggest that this zinc finger
protein-like protein may have important structural and/or physiological functions
characteristic of the zinc finger protein family. Therefore, the nucleic acids and proteins of
the invention are useful in potential diagnostic and therapeutic applications and as a research
tool. These include serving as a specific or selective nucleic acid or protein diagnostic and/or
prognostic marker, wherein the presence or amount of the nucleic acid or the protein is to be
assessed. These also include potential therapeutic applications such as the following: (i) a
protein therapeutic, (ii) a small molecule drug target, (iii) an antibody target (therapeutic,
diagnostic, drug targeting/cytotoxic antibody), (iv) a nucleic acid useful in gene therapy (gene
delivery/gene ablation), (v) an agent promoting tissue regeneration in vitro and in vivo, and
(vi) a biological defense weapon.

The nucleic acids and proteins of the invention have applications in the diagnosis 
and/or treatment of various diseases and disorders. For example, the compositions of the 
present invention will have efficacy for the treatment of patients suffering from: deafness, 
blindness as well as other diseases, disorders and conditions. 

The novel nucleic acid encoding the Zinc Finger Protein-like protein of the invention, 
or fragments thereof, are useful in diagnostic applications, wherein the presence or amount of 
the nucleic acid or the protein are to be assessed. These materials are further useful in the 
generation of antibodies that bind immunospecifically to the novel substances of the 
invention for use in therapeutic or diagnostic methods. These antibodies may be generated 
according to methods known in the art, using prediction from hydrophobicity charts, as 
described in the "Anti-NOVX Antibodies" section below. The disclosed NOV1 protein has
multiple hydrophilic regions, each of which can be used as an immunogen. In one
embodiment, a contemplated NOV1 epitope is from about amino acids 20 to 22. In another
embodiment, a contemplated NOV1 epitope is from about amino acids 30 to 40. In other
specific embodiments, contemplated NOV1 epitopes are from about amino acids 52 to 57, 70
to 80, 90 to 92, 105 to 120, 130 to 150, 160 to 180, 190 to 210, 220 to 240, and 245 to 248.

NOV2 

A disclosed NOV2 nucleic acid (designated as CuraGen Acc. No. CG57107-01),
which encodes a novel Pepsin A Precursor-like protein, includes the 1688 nucleotide sequence
(SEQ ID NO:3) shown in Table 2A. An open reading frame for the mature protein was
identified beginning with an ATG codon at nucleotides 306-308 and ending with a TAA
codon at nucleotides 1518-1520. Putative untranslated regions are underlined in Table 2A,
and the start and stop codons are in bold letters.
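
The stated coordinates can be checked with simple arithmetic. The following Python sketch (the function name is hypothetical) derives the open reading frame length from the first nucleotide of the start codon and the last nucleotide of the stop codon; for nucleotides 306 and 1520 it yields 1215 nucleotides, 405 codons, and 404 encoded residues, which agrees with the 404 amino acid length given for the NOV2 polypeptide.

```python
def orf_summary(start_first_nt, stop_last_nt):
    """Summarize an ORF from 1-based coordinates: the first base of the
    start codon and the last base of the stop codon."""
    span = stop_last_nt - start_first_nt + 1
    if span <= 0 or span % 3 != 0:
        raise ValueError("coordinates do not describe a whole number of codons")
    codons = span // 3
    # The stop codon does not encode a residue.
    return {"nucleotides": span, "codons": codons, "residues": codons - 1}


if __name__ == "__main__":
    print(orf_summary(306, 1520))
```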



Table 2A. NOV2 Nucleotide Sequence (SEQ ID NO:3)

TGCCTOTAGAGTTCAGCTGGTCAGGTGCGAGCACrGTCAAGCTAGCAGGGGCCrCCAC^ 

CCAAGGCAGCGGTAAGTGCCCTCATCACTGGGACGCACAGCerGQATCTGCAGCCT 

GGGGTCCACCCCTAAACTGCACAGAGATGTGGGGGTCa^TCCCCTGGCAGCTGGATGTCCaA^ 

CTCGATGGAGGCCATGGGGTAGGCAAACACITCACAGCCAA^ 

TATGGATGTGAC71CGATCTTCTCCCTCGAGTTGGGACCCGGGAA<^ 

GTGGCGCTCTCTGAGTGCATmTGTACMGGTCCCCCTCaiTC^ 

GTGGCCTGCTGAAGGACTTCCrGAAGAAGCACAACCTCAACCCAGCCAGAAAGTACTTC 

CACCCTGCTAGATGAACAGCCCCTGGAGAACTACCTGGATATGGAGTACTTCGGO^C^ 

GCCCAGGATTTCACTGTCCTCTTTGACACCGGCTCCTCCM^CCTGTGGGTGCCCrrCAGTCT 

CCTGCT^CCAACCACAACCGCTTCaACCCTGAGGATTCTTCCACCTACCAG^ 

CTACGGCACCX3GCAGCATGAaVGGCM*CCTCGGATACX3ACACr6TCCAGGTO 

ATCrTCGGCCTGAGCGAGACGGAACCTGGCTCCTTCCrGTATTATGCTCCCrrT^ 

ACCCCAGmTTTCCTCCTCCGGGGCCaCACCCGTCTTTGACAACATCTGGAAC 

CrTCTCTGTCrACCTCa«K:GCCGATGACCAGAGTGGCAGCX5T^ 

ACTGGAAGTCTCAACTGGGTGCCTGTTACCGTCGAGGGTTACTGGCAGATCACCGTGGAC^^ 

GAGAGGCCATCGCCTGCGCTGAGGGCTGCCyiGGCCATTGTTGACACCGGCT^CCTCTCTGCTGACCGGCCC^ 

CCCCATTGCCAACATCCAGAGCGACATCGGAGCCAGCGAGAACTCAGATGGCGACATGGTGGTCAGCTGCTCAGCC 

ATCAGCAGCCTGCCCGACATCGTCTTCACCATCAATGGAGTCCAGTACCCCGTGCCACCCAGTGCCTACATCCTGC 

AGAGCGAGGGGAGCTGCATCAGTGGCTTCCAGGGCATGAACCTCCCCACCGAATCTGGAGAGCTTTGGATCCTGGG 

TGATGTCTTCATCCGCCAGTACTTTACCGTCTTCGACAGGGCAAACAACCAGGTCAGCCTGGCCCCCGTGGCTTAA 

GCCTAAGTCTCTTCAGCCACCTCCCAQGAAGATCTGGCCTCTGTCCTGTGCCCACTTTAGATGTATCTAATTCTCC 

TGACTGTTCTTCCCAGGGGAGTGTGGAGGTCTTGGCCCTGTTCCCTGTCCTACCAATAACGTAGAATAAAAACATA 

ACCCACCAAAAAAAAA 



The nucleic acid sequence of NOV2 maps to chromosome 10q24 and has 1285 of 1352
bases (95%) identical to a gb:GENBANK-ID:MFPEPA23|acc:X59755.1 mRNA from
Macaca fuscata (M. fuscata mRNA for pepsinogen A-2/3) (E = 5.6e*^^^).

A disclosed NOV2 polypeptide (SEQ ID NO:4) is 404 amino acid residues in length 
and is presented using the one-letter amino acid code in Table 2B. The SignalP, Psort and/or 
Hydropathy results predict that NOV2 is likely to be localized at the endoplasmic reticulum 
(membrane) with a certainty of 0.6000. In alternative embodiments, a NOV2 polypeptide is 
located to the microbody (peroxisome) with a certainty of 0.3788, the mitochondrial inner
membrane with a certainty of 0.2567, or the plasma membrane with a certainty of 0.1000.
The SignalP predicts a likely cleavage site for a NOV2 peptide between amino acid positions
31 and 32, i.e., at the sequence SEC-IM.



Table 2B. Encoded NOV2 Protein Sequence (SEQ ID NO:4) 

WEAPTIiVDEQPLENYIiDMEYFGTIGIGTPAQDFTVLFDTGSSlSnaWPSVycSSIiACTmi^^ 
TSETVSITYGTGSMTGILGYDTVQVGGISDTNQIFGLSETEPGSFIiyYAPFDGII.GLAYPSISSSGATPVFD 
NIWNQGLVSQDLFSVYLSADDQSGSVVXFGGIDSSYYTGSIiNWPVTVEGYWQITVDSITm 
QAIVDTGTSLLTGPTS PI ANI QSDIGASENSDGDMVVSCSAI SSLPDIVFTINGVQYPVPPSAY ILQSEGSC 
ISGFQGMNLPTES(a:LWILGDVFIRQYFTVJ?X)RAKNQVSLAPVA 
The NOV2 amino acid sequence was found to have 385 of 388 amino acid residues (99%)
identical to, and 387 of 388 amino acid residues (99%) similar to, the 388 amino acid residue 
ptnr:SWISSNEW-ACC:P00790 protein from Homo sapiens (Human) (PEPSIN A 
PRECURSOR (EC 3.4.23.1)) (E = l.Oe'^^**). 
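
The identity percentages quoted above are whole numbers derived from match counts. A minimal Python sketch of that calculation follows; the function name is hypothetical, and truncation to an integer (as opposed to rounding) is an assumption made here, although either convention gives the same result for the two ratios shown.

```python
def percent_identity(matches, aligned_length):
    """Return the integer percentage of matching positions,
    truncated toward zero (assumed convention)."""
    if aligned_length <= 0 or not 0 <= matches <= aligned_length:
        raise ValueError("matches must lie between 0 and aligned_length")
    return matches * 100 // aligned_length


if __name__ == "__main__":
    print(percent_identity(385, 388))    # protein identity for NOV2
    print(percent_identity(1285, 1352))  # nucleotide identity for NOV2
```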

NOV2 is expressed in at least the following tissues: stomach and testis. Expression 
information was derived from the tissue sources of the sequences that were included in the 
derivation of the sequence of NOV2. 

Possible single nucleotide polymorphisms (SNPs) found for NOV2 are listed in Table 2C.



Table 2C: SNPs

Variant     Nucleotide Position    Base Change    Amino Acid Position    Base Change
13374720    386                    G>A            NA                     NA
13374721    525                    G>A            74                     Glu>Lys
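
The relationship between the nucleotide and amino acid positions of a coding SNP can be illustrated with a short Python sketch. The function name is hypothetical, and the only inputs are the SNP positions tabulated above and the NOV2 start codon position (nucleotide 306) stated earlier. Nucleotide 525 maps to the first base of codon 74, matching the tabulated amino acid position, and nucleotide 386 maps to the third base of codon 27, a position where many substitutions are silent.

```python
def codon_position(nt_pos, orf_start):
    """Map a 1-based nucleotide position to (codon number, base-in-codon),
    both 1-based, given the 1-based position of the first ORF base."""
    offset = nt_pos - orf_start
    if offset < 0:
        raise ValueError("position lies upstream of the open reading frame")
    return offset // 3 + 1, offset % 3 + 1


if __name__ == "__main__":
    print(codon_position(525, 306))
    print(codon_position(386, 306))
```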



Also included in the invention are four variants of NOV2: NOV2a (designated as 
CuraGen Acc. No. 175069704), NOV2b (designated as CuraGen Acc. No. 175069720),
NOV2c (designated as CuraGen Acc. No. 175069724), and NOV2d (designated as CuraGen 
Acc. No. 175069728). An alignment of these sequences is given in Table 2D. 



Table 2D: NOV2 variants 






10 20 30 40 50 60 


N0V2a 
NOV2b 
H0V2C 
NOV2<i 


1 
1 
1 
1 


GTCGACAGCCACGGGGGCCAGGCTGACCTGGTTGTTTGCCCTGTCGAAGACGGTAAAGTA^^^^^M 
GTCGACAGCCACGGGGGCCAGGCTGACCTGGTTGTTTGCCCTGTCGAAGACGGTAAAGT^'.^^^^^B 
GTCGACAGCCACGGGGGCCAGGCTGACCTGGTTGTTTGCCCTGTCGAAGACGGTAAAGTA^^^^^B 
GTCGACAGCCACGGGGGCCAGGCTGACCTGGTTGTTTGCCCTGTCGAAGACGGTAAAGTA^^^^^H 






70 80 90 100 110 120 


NOV2a 
NOV2b 

Nrov2c 

NOV2d 


61 
61 
61 
61 


CTGGCGGATGAAGACATCACCCAGGATCCAAAGCTCTCCAGATTCGGTGGGGAGGTTCATB^^^^B 
CTGGCGGATGAAGACATCACCCAGGATCCAAAGCTCTCCAGATTCGGTGGGGAGGTTCAtB^^^^B 

ctggcggatgaagacatcacccaggatccaaagctctccagattcggtggggaSgttcatB^^^M 
ctggcggatgaagacatcacccaggatccaaagctctccagattcggtggggaggttcatBH^^H 






130 140 150 160 170 180 

j 1 1 1 1 . . ..I 1 1 } j — i — 1 


N0V2a 
N0V2b 
NOV2C 
N0V2d 


121 
121 
121 
121 


gccctggaagccactgatgcagctcccctcgctctgcaggatgtaggcactgggtggcac»^^^^B 
gccctggaagccactgatgcagctcccctcgctctgcaggatgtaggcactgggtggcacB^^^^H 
gccctggaagccactgatgcagctcccctcgctctgcaggatgtaggcactgggtggcacB^^^^B 
grcrtggaagccactgatgcagctcccctcgctctgcaggatgtaggcactgggtggcacbefi^^h 






190 200 210 220 230 240 


NOV2a 
NOV2b 
NOV2C 
NOV2d 


181 
181 
181 
181 


ggggtactggactccattgatggtgaagacgatgtcgggcaggctgctgatggctgagc^-w^^^m 
ggggtactggactccattgatggtgaagacgatgtcgggcaggctgctgatggctgagcaR^^^B 
ggggtactggactccattgatggtgaagacgatgtcgggcaggctgctgatggctgagca^^^^^h 

GGGGTACTGGACTCCATTGATGGTQAAGACGATGTCGGGCAGGCTGCTGATGGCTGAGCJ!>EE^^H 









21 



250 

1 . 



260 



[ . 



270 
. . 1 . . 



280 



1 . 



I 



290 



300 



I . 



N0V2a 241 

NOV2b 241 

NOV2C 241 

NOV2d 241 



NOV2a 301 

N0V2b 301 

NOV2C 301 

N0V2d 301 



NOV2a 361 

NOV2b 361 

N0V2C 361 

N0V2d 361 



NOV2a 421 

IIOV2b 421 

NOV2C 421 

N0V2d 421 



N0V2a 481 

NOV2b 481 

N0V2C 481 

NOV2d 481 



NOV2a 541 

NOV2b 541 

HOV2C 541 

N0V2d 541 



NOV2a 601 

NOV2b 601 

NOV2c 601 

NOV2d 601 



NOV2a 661 

NOV2b 661 

NOV2C 661 

NOV2d 661 



N0V2a 721 

NOV2b 721 

KOV2C 721 

N0V2d 721 



NOV2a 781 

NOV2b 781 

N0V2C 781 

N0V2d 781 



^TGACCACCATGTCGCCATCTGAGTTCTCGCTGGCTCCGATGTCGCTCTGGATGTTGGC 
-TGACCACCATGTCGCCATCTGAGTTCTCGCTGGCTCCGATGTCGCTCTGGATGTTGGC 
-TGACCACCATGTCGCCATCTGAGTTCTCGCTGGCTCCGATGTCGCTCTGGATGTTGGC 
CTTGACCACCATGTCGCCATCTGAGTTCTCGCTGGCTCCGATGTCGCTCTGGATGTTGGC 



300 
300 
300 
300 



I . 



310 
. . I . . 



I . 



320 



1 



330 
. . I . . 



340 



350 



360 



I 



AATGGGGCTGGTTGGGCCGGTCAGCAGAGAGGTGCCGGTGTCAACAATGGCCTGGCAGCCj 
AATGGGGCTGGTTGGGCCGGTCAGCAGAGAGGTGCCGGTGTCAACAATGGCCTGGCAGCC! 
AATGGGGCTGGTTGGGCCGGTCAGCAGAGAGGTGCCGGTGTCAACAATGGCCTGGCAGCC 
AATGGGGCTGGTTGGGCCGGTCAGCAGAGAGGTGCCG GTGTCAACAATGGCCTGGCAGCC 



360 
360 
360 
360 



370 



380 



390 



420 




430 



440 



450 



I 



460 



470 



480 



1 



CCCTCGACGGTAACAGGCACCCAGTTCAGACTTCCAGTGTAGTAAGAAGAGTCAATGCCj 
aCCCTCGACGGTAACAGGCACCCAGTTCAGACTTCCAGTGTAGTAAGAAGAGTCAATGCC 
?.CCCTCGACGGTAACAGGCACCCAGTTCAGACTTCCAGTGTAGTAAGAAGAGTCAATGCC: 
ArrrTCGAGGGTAACAGGCACCCAGTTCAGACTTCCAGTGTAGTAAGAAG AGTCAATGCC 



480 
480 
480 
480 



490 



1 . 



500 



I 



510 



520 



53 0 



I . 



I . 



540 



CCAAAGATCACCACGCTGCCACTCTGGTCATCGGCGCTGAGGTAGACAGAGAAGAGGTC: 
ACCAAAGATCACCACGCTGCCACTCTGGTCATCGGCGCTGAGGTAGACAGAGAAGAGGTC 
ACCAAAGATCACCACGCTGCCACTCTgGTCATCGGCGCTGAGGTAGACAGAGAAGAGGTC 
ACCAAAGATCACCACGCTGCCACTCTGGTCATCGGCGCTGAGGTAGACAGAGAAGAGGTC 



550 



560 



570 



I 



580 



590 



600 



CTGAGAAACCAGGCCCTGGTTCCAGATGTTGTCAAAGACGGGTGTGGCCCCGGAGGAGGA 
CTGAGAAACCAGGCCCTGGTTCCAGATGTTGTCAAAGACGGGTGTGGCCCCGGAGGAGGA 
CTGAGAAACCAGGCCCTGGTTCCAGATGTTGTCAAAGACGGGTGTGGCCCCGGAGGAGGA 
CTGAGAAACCAGGCCCTGGTTCCAGATGTTGTCAAAGACGGGTGTGGCCCCGGAGGAGGJ\ 



I , 



610 



I . 



620 



630 



640 



650 



660 



AATGCTGGGGTAGGCCAGCCCCAGGATGCCATCGAAGGGAGCATAATACAGGAAGGAGCC 
AATGCTGGGGTAGGCCAGCCCCAGGATGCCATCGAAGGGAGCATAATACAGGAAGGAGCC 
AATGCTGGGGTAGGCCAGCCCCAGGATGCCATCGAAGGGAGCATAATACAGGAAGGAGCC 
AATG CTGGGGTAGGCC AGCCCC AGGATGScAT CGAAGGGAGCAT AATACAGGAAGGAGCC 



660 
660 
660 
660 



690 



710 



I - 



GGTTCCGTCTCGCTCAGGCCGAAGATCTGATTGGTGTCAGAGATGCCTCCAACCTGGAC 
f^GGTTCCGTCTCGCTCAGGCCGAAGATCTGATTGGTGTCAGAGATGCCTCCAACCTGGAC 
i^GGTTCCGTCTCGCTCAGGCCGAAGATCTGATTGGTGTCAGAGATGCCTCCAACCTGGAC 
'GGTTCCGTCTCGCTCAGGCCGAAGATCTGATTGGTGTCAGAGATG CCTCCAA CCTGGAC 



I . 



730 

I 



740 

I 



I 



750 

1 . 



I . 



760 



770 



780 



GTGTCGTATCCGAGGATGCCTGTCATGCTGCCGGTGCCGTAGGTGATGGAGACTGTCTC: 
?.GTGTCGTATCCGAGGATGCCTGTCATGCTGCCGGTGCCGTAGGTGATGGAGACTGTCTC 
f\GTGTCGTATCCGAGGATGCCTGTCATGCTGCCGGTGCCGTAGGTGATGGAGACTGTCTC 
z^GTGTCGTATCCGAGGATGCCTGTCATGCTGCCGGTGCCGTAGGTGATGGA GACTGTCTC 



790 



800 



I . 



I , 



810 



820 



83 0 



L 



840 



GCTGGTGGACTGGTAGGTGGAAGAATCCTCAGGGTTGAAGCGGTTGTGGTTGGTGCAGGC 
GCTGGTGGACTGGTAGGTGGAAGAATCCTCAGGGTTGAAGCGGTTGTGGTTGGTGCAGGC 
GCTGGTGGACTGGTAGGTGGAAGAATCCTCAGGGTTGAAGCGGTTGTGGTTGGTGCAGGC 
GCTGGTGGACTGGTAGGTGGAAGAATCCTCAGGGTTGAAGCGGTTGTGGTTGGTGCAGGC 



850 



860 



870 



880 



890 



900 
NOV2a
N0V2b 
N0V2C 
NOV2d 



841 
841 
841 
841 



^AGACTGGAGCAGTAGACTGAGGGCACCCACAGGTTGGAGGAGCCGGTGTCAAAGAfiGACi 
A^AGACTGGAGCAGTAGACTGAGGGCACCCACAGGTTGGAGGAGCCGGTGTCAAAGACGAC 
AAGAC^GGAGCAGTAGACTGAGGGCACCCACAGGTTGGAGGAGCCGGTGTCAAAGACGAC 
aar;arTr,nAnrAGTAGACTGAGGGCACCCACA GGTTGGAGGAGCCGGTGTCAAAGACGAC 



900 
900 
900 
900 



! 



910 

I 



I 



920 



930 



940 



I 



I . 



950 



960 



N0V2a 901 
NOV2b 901 
NOV2C 
NOV2d 



90X 
901 



GTG2\AATCCTGGGCAGGAGTTCCGATGCCGATAGTGCCGAAGTACTCCATATCCAGGT. 
GGTGAJ^TCCTGGGCAGGAGTTCCGATGCCGATAGTGCCGAAGTACTCCATATCCAGGTA 
GGTGAAATCCTGGGCAGGAGTTCCGATGCCGATAGTGCCGAAGTACTCCATATCCAGGTA 
GGTGAAATCCTGGGCAGGAGTTCCGATGCCGATAGTGCCGAAGTACTCCATATCCAGGTA 



960 
960 
960 
960 



I 



970 
I 



I . 



980 



i 



990 
. . 1 . . 



1000 



1010 



1020 



N0V2a 
NOV2b 
NOV2C 
N0V2d 



NOV2a 
NOV2b 
NOV2C 
lI0V2d 



N0V2a 
NOV2b 



961 
961 
961 
961 



1021 
1021 
1021 
1021 



1081 
1081 



STTCTCCAGGGGCTGTTCATCTACCAGGGTGGGAGCCTiSCCACTGGGGGAAGTACTTTCT 
GT-TCTCCAGGGGCTGTTCATCTACCAGGGTGGGAGCCTCCCACTGGGGGAAGTACTTTCT 
GTTCT^CCAGGGGCTGTTCATCTACCAGGGTGGGAGCCTCCCACTGGGGGAAGTACTTTCT 
-TTPTrriinnriGrTGTTCATCTACCAGGGTGGGAGCCTCCCACTGGGGGAAGTACTTTCT 



1020 
1020 
1020 
1020 



1030 
1 



\ 



1040 1050 
. t I j... 



1060 



1070 



GGCTGGGTTGAGGTTGTGCTTCTTCAGGAAGTCCTTCAGCAGGCCACGCTCGGACAGGGT 
GGCTGGGTTGAGGTTGTGCTTCTTCAGGAAGTCCTTCAGCAGGCCACGCTCGGACAGGGT 
GGCTGGGTTGAGGTTGTGCTTCTTCAGGAAGTCCTTCAGCAGGCCACGCTCGGACAGGGT 
GGCTGGGTTGAGGTTGTGCTTCTITAGGAAGTCCTTCAGCAGGCCACGCTCGGACAGGOT 



1080 

A 

1080 
1080 
1080 
1080 



1090 1100 

I _ . } 1 . _ 



1110 



1120 



1130 



HOV2C 1081 
NOV2d 1081 



GCGCCTCAAGGACTTCTTTCTGATGAGGGGGACCTTGTACATGATAAGCTT 
^GCGCCTCAAGGACTTCTTTCTGATGAGGGGGACCTTGTACATGATAAGCTT 

Igcgcctcaaggacttctttctgatgagggggaccttgtacatgataagctti 
Igcgcctcaaggacttctttctgatgagggggaccttgtacatgataagctt 



(SEQ ID NO: 5) 
(SEQ ID NO: 7) 
(SEQ ID NO: 9) 
(SEQ ID NO: 11) 



The proteins associated with NOV2a, NOV2b, NOV2c, and NOV2d are encoded in 
negative reading frames. An alignment of all NOV2 proteins is shown in Table 2E. 
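
Reading a protein from a negative reading frame means translating the reverse complement of the listed strand. The Python sketch below illustrates this with hypothetical helper names and the standard genetic code; it is not the software used to prepare the tables. The input is the first 30 nucleotides common to the variant rows of Table 2D, and the resulting peptide corresponds to the C-terminal residues of the variant proteins.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate whole codons from the first base; '*' marks a stop codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


if __name__ == "__main__":
    listed = "GTCGACAGCCACGGGGGCCAGGCTGACCTG"
    coding = reverse_complement(listed)
    print(coding)             # CAGGTCAGCCTGGCCCCCGTGGCTGTCGAC
    print(translate(coding))  # QVSLAPVAVD
```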



NOV2a 
NOV2b 
NOV2C 
N0V2d 
N0V2 



Table 2E: NOV2 protein variants 

10 20 30 40 50 60 

1 iBj8a»i«siia.4ja«iA<yiia 28 

X mmWrWtt w ''^ 

X ffi^BMimim^^ 

1 ^'-^-^ ^ «^<^ oT>Tnpnp oMgMT .T.TT ■r:T.^raT.gT^r T'oiMt«Maii«aMacf ■K<r4iMH«agieiiiw-^^ 60 

70 80 90 100 110 120 

I . ! I I . . , , I I ! I ( I \_ 



MaV2a 


29 


I«0V2b 


29 


NOV2C 


29 


N0V2d 


29 


N0V2 


61 


N0V2a 


89 


N0V2b 


89 


N0V2C 


89 


HOV2d 


89 


NOV2 


121 


NOV2a 


149 


NOV2b 


149 


NOV2C 


149 



KNLNPARKYFPQVs^APTLVDEQPLENYLDKEYFGTIGIGTFAQDrT^lFDTGSSNLWVPS: 
KNLNPARKYFPQWHAPTLVDEQPLENYLDMSYFGTIGIGTPAQDFTWFDTGSSNLWVPS^ 
HNLNPARKYFPQW3APTLVDSQPLENYLDMSYFGTIGIGTPAQDFTWFDTGSSNLWVPS 
KNLNPARKYFPQWEAPTLVDEQPLEI^LDMEYFGTIGIGTPAQDFTWFDTGSSNLWVPS 

.nMFYFGTTGTGTPAODFT^TlFDTGSSNLWVPS 



88 
88 
88 
88 
120 



I 



130 



140 



150 



160 



170 



180 



} 



I 



v-i-CSSLACTNHNRFNPEDSSTYQSTSETVSITYGTGSMTGILGYDTVQVGGISDTNQIFG 
\rfCSSLACTNHNRFNPEDSSTYQSTSETVSITYGTGSMTGILGYDTVQVGGISDTNQIFG 
^/YCSSLACTNHNRFNPSDSSTYQSTSETVSITYGTGSMTGILGYDTVQVGGISDTNQIFG 
^.^CSSLACTNHNRFNPEDSSTYQSTSETVSITYGTGSMTGILGYDTVQVGGISDTNQIFG 

wcsslj^ctnhnrfnpedsstyqItsetvsitygtgsmtgilgybtvqvggisdtnqifg 



148 
148 
148 
148 
180 



190 
I . . 



1 . 



200 
. . } . . 



210 



220 



230 



240 



{ 



lsetepgsflyyapfdgilglaypsisssgatpvfdniwnqglvsqdlfsvylsaddqsg 
lsstepgsflyyapfdgilglaypsisssgatpvfdniwnqglvsqdlfsvylsaddqsg 
lsetepgsflyya?fdgilglaypsisssgatpvfdniwnqglvsqdlfsvylsadd|sg 



208 
208 
208 



23 



N0V2d 


149 




ILGLAYPSISSSGATPVFDNIWNQGLVSQDLFSVYLSADDQSG 


208 


HOV2 


181 




ILGLAYPSISSSGATPVFDNIWNQGLVSQDLFSVYLSADDQSG 


240 






250 


260 270 280 


290 300 
...|....{....| 


NOV2a 


209 


SWIFGGIDSSYYTGSLNXWPVTVEGYWQITVDSITMNGEAIACAE 


GCQAIVDTGTSLLT 


268 


NOV2b 


209 


SWIFGGIDSSYYTGSLNVJVPVTVEGYWQIT^/DSITMNGEAIACAE 


GCQAIVDTGTSLLT 


268 


NOV2C 


209 


S WIFGG IDS S YYTG SLlSn^P VTVEG YT\^Q ITVD S I Tr-INGEjJl ACAEGCQAI VDTGTSLLT 


268 


NOV2d 


209 


SWIFGGIDSSYYTGSLI^/T^PWVEGYVJQITVDSITMNGEAIACAEGCQAIVDTGTSLLT 


268 


N0V2 


241 


S WI FGG IDS S YYTGSLNWVPVT VEGY;^7Q IT\^ S ITMNGS AI ACAS 


GCQAIVDTGTSLLT 


300 






310 

...,|....|....|.. 


320 330 340 


350 360 

. . . }....|....| 


N0V2a 


269 


GPTSPIANIQSDIGASENSDGDMWSCSAISSLPDIVFTINGVQYPVPPSAyiLQSEGSC 


328 


N0V2b 


269 


GPTSPIANIQSDIGASE 


NSDGDMWSCSAISSLPDIVFTINGVQYPVPPSAYILQSEGSC 


328 


NOV2C 


269 


GPTSPIAIMIQSDIGASENSDGDMWSCSAISSLPDIVFTINGVQYPVPFSAYILQSEGSC 


328 


KOV2d 


269 


GPTSPIANIQSDIGASE 


KSDGDMWSCSAISSLPDIVFTINGVQYPVPPSAYILQSEGSC 


328 


N0V2 


301 


GPTSPIANIQSDIGASE 


NS DGDMW^S CS AI S SLPD I VFT INGVQ YP 


VPPSAYILQSEGSC 


360 






370 

. . . - K . - . ,1^ 


380 390 400 

..|....|....[....| .... 1 .... L 






NOV2a 


329 


ISGFQGJ-INLPTESGELW 


I LGDVF I RQYFTVFDRANNQVS LAPVAVD 


374 (SEQ ID NO: 6) 


NOV2b 


329 


I SGFQGMNL PTESGSLW 


ILGDVF IRQYFTVFDRANNQVSLAPVAVD 


374 (SEQ ID NO: 8) 


N0V2C 


329 


I sgfqgmn|?tesgslw 


I LGDVF IRQ YFTVFDRANNQVS LAPVAVD 


374 (SEQ ID NO: 10) 


N0V2d 


329 


ISGFQGMNLPTESGELW 


I LGDVF I RQYFTVFDRANNQVSLAP VAVD 


374 (SEQ ID NO: 12) 


HOV2 


361 


I SG fqgmnlptesgelw 


ILGDVF IRQYFTVFDRANNQVSLAPVAH! 


404 (SEQ ID N0:4) 















Homologies to any of the above NOV2 proteins will be shared by the other NOV2 
proteins insofar as they are homologous to each other as shown above. Any reference to 
NOV2 is assumed to refer to the NOV2 proteins in general, unless otherwise noted. 

NOV2 also has homology to the amino acid sequences shown in the BLASTP data 
listed in Table 2F. 



Table 2F. BLAST results for NOV2

Gene Index/Identifier                  Protein/Organism                                   Length (aa)   Identity (%)     Positives (%)    Expect
gi|129792|sp|P00790|PEPA_HUMAN         Pepsin A precursor                                 388           385/388 (99%)    387/388 (99%)    0.0
gi|625423|pir||A30142                  pepsin A (EC 3.4.23.1) 5 precursor - human         388           384/388 (98%)    387/388 (98%)    0.0
gi|387013|gb|AAA60061.1| (M26032)      pepsinogen A [Homo sapiens]                        388           383/388 (98%)    386/388 (98%)    0.0
gi|625424|pir||B30142                  pepsin A (EC 3.4.23.1) 4 precursor - human         388           382/388 (98%)    386/388 (99%)    0.0
gi|129780|sp|P27677|PEP2_MACFU         PEPSIN A-2/A-3 PRECURSOR (PEPSIN III-2/III-1)      388           367/388 (94%)    381/388 (97%)    0.0



The homology of these sequences is shown graphically in the ClustalW analysis
shown in Table 2G.
Table 2G. ClustalW Analysis for NOV2

1) NOV2 (SEQ ID NO:4)
2) gi|129792 (SEQ ID NO:128)
3) gi|625423 (SEQ ID NO:129)
4) gi|387013 (SEQ ID NO:130)
5) gi|625424 (SEQ ID NO:131)
6) gi|129780 (SEQ ID NO:132)



NOV2 

gi 1 129792 I 
gi I 625423 I 
gi I 387013 I 
gi j 625424 | 
gij 129780 I 



NOV2 

gi 1 129792 I 
gij 625423 [ 
gij 387013 I 
gi [6254241 
gij 129780 I 



61 
45 
45 
45 
45 
45 



gij 625423 i 
gi 1387013 I 
gi I 625424 I 
gij 129780 I 




70 



! . 



80 



90 



100 



110 



120 
.♦I 



HNLNPARKYFPQWEAPTLVDSQPLENYLDMEYFGTIGIGTPAQDFTVLFDTGSSNLWVPSl 
HNLNPARKYFPQWEAPTLVDEQPLENYLDMEYFGTIGIGTPAQDFTVlFDTGSSNLWPSi 

hnlnparkyfpq'^^jIaptlvdeqplenyldmeyfgtigigtpaqdftvxfdtgssnlwvps 
hnlnparkyfpqvv^aptl\t:eqplenyldmeyfgtigigtfaqdftvlfdtgssnlwps 
Iknlnparkyfpqweaptlvdeqplei^lbmeyfgtigigtpaqdftvlfdtgssnlwvps 

lHNl3N'PAgKYFPQgEAPTL§DBQPLENYLDMEYFGTIGIGTPAQDFTVlFDTGSSNLWVPS 



120 
104 
104 
104 
104 
104 



180 



NOV2 


121 


gi| 1297921 


105 


gij 625423 1 


105 


gi 1 387013 1 


105 


gij 625424 j 


105 


gij 1297801 


105 


NOV2 


181 




200 



210 



1 



I 



165 
165 
165 
165 
165 



^SETEPGSFLYYAPFDGILGLAYPSISSSGATPVFDNIWNQGLVSQDLFSVYLSADDQSG! 
LSETEPGSFLYYAPFDGILGLAYPSISSSGATPVFDNIWl^IQGLVSQDLFSVYLSADDQSG 
LSETEPGSFLYYAPFDGILGLAYPSISSSGATPVFDNIK^-QGLVSQDLFSVYLSADDQSG 
LSETSPGSFLYYAPFDGILGLAYPSISSSGATPVFDNII'JNgGLVSQDLFSVYLSADBQSG 
LSETEPGSFLYYAPFDGILGLAYPSISSSGATPVFDNIVJNQGLVSQDLFSVYLSADBfSG 
T.SETEPGSFLYYAPFDGILGLAYPSISSSGATPVFDNIWNQGLVSQDLFSVYLSADDQSG 



240 
224 
224 
224 
224 
224 



250 



. 1 , 



260 



270 



280 



I 



290 



300 



NOV2 


241 


gi 


129792] 


225 


gi 


6254231 


225 


gi 


387013 j 


225 


gi 


625424 j 


225 


gi 


1297801 


225 



SWIFGGIDSSYYTGSLNVJVPVTVEGYWQITVDSITMNGSAIACAEGCQAIVDTGTSLLT 
SWIFGGIDSSYYTGSLNVJVPVTVEGYI^^QITVDSITMNGSAIACAEGCQAIVDTGTSLLT 
SWIFGGIDSSYYTGSLISr/^VPVTVEGYWQITVDSITMNGSAIACASGCQAIVDTGTSLLT 
SWIFGGIDSSYYTGSLI.WPVTVEGYVJQITVDSITMNGEAIACAEGCQAIVDTGTSLLT 
SWIFGGIDSSYYTGSLNX-T/PVTVEGYWQITVBSITMNGEUIACAEGCQAIVDTGTSLLT 



300 
284 
284 
284 
284 
284 



310 



1 , 



320 
. . [ - . 



330 
. . I . . 



34 0 
. . I . . 



350 



360 
.-I 



NOV2 


301 


gi| 


129792] 


285 


gil 


625423] 


285 


gi| 


387013 


285 


gij 625424 


285 


gi| 


129780 


285 


NOV2 


361 


gi 


129792 


345 


gi 


625423 


34 5 


gi 


387013 


345 


gi 


625424 


345 


gi 


129780 


345 



GPTSPIANIQSDIGASSNSDGDMWSCSAISSLPDIVFTINGVQYPVPPSAYILQSEGSC 
GPTSPIANIQSDIGASENSDGDrWVSCSAISSLPDIVFTINGVQYPVPPSAYILQSEGSC 
GPTSPIAKIQSDIGASENSBGDMWSCSAISSLPDIVFTINGVQYPVPPSAYILQSEGSC 
GPTSPIANIQSBIGASENSDGBMWSCSAISSLPBIVFTINGVQYFVPPSAYILQSSGSC 
GPTSPIANIQSDIGASENSBGBMWSCSAISSLPBIVFTINGVQYPVPPSAYILQSEGSC 
GPTSPIANIQSBIGASSNSDGiMWSCSAISSLPBIVFTINGlQYPVPPSAYILQSiGSC 



360 
344 
344 
344 
344 
344 



370 



} , 



380 



390 

I I . 



400 



ISGFQGMNLPTESGELWILGBVFIRQYFTVFlRANNQVfLAPVA 
ISGFQGm^LPTESGELWILGDVFIRQYFTVFiRANNQVGLAPVA 
ISGFQGr/INLPTESGELWILGDVFIRQYFTVF|RANNQVGLAPVA 
I S GFQGr^xNLP TE SGELWILGBVF I RQYFTVF|RANNQVGL AP VA 
ISGFQG?^In1pTSSGELWILGDVFIRQYFTVF!|RANNQVGLA?VA 
I SGFQG^MP TE SGSLWI LGDVF I RQYFTVf|rANNQVGLAPV^ 



404 
388 
388 

388 
388 
388 
Table 2H lists the domain description from DOMAIN analysis results against NOV2.
This indicates that the NOV2 sequence has properties similar to those of other proteins 
known to contain these domains. 



Table 2H. Domain Analysis of NOV2 

gnl|Pfam|pfam00026, asp, Eukaryotic aspartyl protease. Aspartyl (acid)
proteases include pepsins, cathepsins, and renins. Two-domain structure,
probably arising from ancestral duplication. This family does not include the
retroviral nor retrotransposon proteases (pfam00077), which are much smaller
and appear to be homologous to a single domain of the eukaryotic asp
proteases.

CD-Length = 376 residues, 99.5% aligned
Score = 462 bits (1189), Expect = 2e-131

NOV 2:  35 KVPLIRKKSIJiRTLSERGLLKDFLKKHNmPARKYFPQWEAPTLVDEQPIJS^^ 94
Sbjct:   3 RIPIiKKVPSIJ^KLSEKGVI*LDFLVKRKyEPTKKL,TGGASSSRSAVE-PLIJT5n^Di^ 61

NOV 2:  95 TIGIGTPAQDFTVLFDTGSSNLWVPSVYCSSL-ACTNHNRFNPEDSSTYQATSETVSITY 153
Sbjct:  62 TISIGTPPQKFTWFDTGSSDLWVPSVYCTSSYACKGHGTFDPSKSSTYKNLGTTFSISY 121

NOV 2: 154 GTGS-MTGILGYDTVQVGGISDTNQIFGLSETEPGSFLYYAPFDGILGIAYPSISSSGA- 211
Sbjct: 122 GDGSSASGFLGQDTVTVGGITVTNQQFGLATKEPGSFFATAVFDGILGLGFPSIEAGGPY 181

NOV 2: 212 TPVFDNIWNQGLVSQDIiFSVYLSADDQSGSVVIFGGIDSSyYTGSLNWVPVTVEGYWQIT 271
Sbjct: 182 TPVFDNLKSQGLIDSPAFSVYLNSDSGAGGEIIFGGVDPSKYTGSLTWVPVTSQGYWQIT 241

NOV 2: 272 VDSITMNGEAIACAEGCQAIVDTGTSLLTGPTSPIANIQSDIGASENSD-GDMWSCSAI 330
Sbjct: 242 LDSITVGGSTTFCSSGCQAILDTGTSLLYGPTSIVSKIAKAVGASLSEYSGEYVIDCDSI 301

NOV 2: 331 SSLPDIVFTINGVQYPVPPSAYILQSEGS CISGFQG^4NLPTESGELWILGDVFIRQ 386
Sbjct: 302 SSLPDITFFIGGAKITVPPSAYVLQPSSGGSDICLSGFQSDDIPG--GPLWILGDVFLRS 359

NOV 2: 387 YFTVFDRANNQVSDAPV 403 (SEQ ID NO:133)
Sbjct: 360 AYWFDRDNNRIGLAPA 376 (SEQ ID NO:134)

Pepsin is one of the main proteolytic enzymes secreted by the gastric mucosa. It
consists of a single polypeptide chain and arises from its precursor, pepsinogen, by removal
of a 41-amino acid segment from the amino end. Pepsin is particularly effective in cleaving
peptide bonds involving aromatic amino acids. Samloff and Townes (1970) showed that the
pepsinogen-5 derived from the stomach and excreted in the urine is absent in some persons.
Family and population data supported the view that absence of PG-5 is recessive, i.e., persons
with the PG-5 band on electrophoresis are either homozygous or heterozygous for a particular
allele. Samloff et al. (1973) found no instance of absent PG-5 among Japanese, Chinese and
Filipinos. Among American whites and blacks a frequency of 14% was found. Data,
suggestive but not conclusive, of linkage of Kell (110900) and pepsinogen were reported by
Weitkamp et al. (1975). Data of Gedde-Dahl et al. (1978) cast doubt on the linkage of PG
and HLA. Whittington et al. (1980) excluded linkage of PG with either HLA or glyoxalase I.
Korsnes et al. (1980) found no clear evidence of linkage between PG5 and 28 marker loci.
Linkage below 25% recombination for HLA and GPT was ruled out. Linkage below 20%
recombination was ruled out for Rh, PGM-1, and several others. The possibility of loose
linkages included Pg5-C6 and Pg5-MNSs. In the mouse, Szymura and Klein (1981) found
linkage of urinary pepsinogen with the major histocompatibility complex. Arguing from
homology, one might take this as suggestive evidence that a pepsinogen gene is on
chromosome 6. See duodenal ulcer, hyperpepsinogenemic I (126850).

Sogawa et al. (1983) isolated a recombinant clone for the human pepsinogen gene by
screening the Maniatis library of human genomic DNA with a swine pepsinogen cDNA as a
probe. They concluded that the pepsinogen gene occupies about 9.4 kb pairs of genomic
DNA and is separated into 9 exons by 8 introns of variable lengths. The predicted amino acid
sequence of human pepsinogen consists of 373 residues and is 82% homologous with that of
swine pepsinogen. The predicted sequence contains 15 amino acid residues at the NH2 end,
showing that the protein is synthesized as a prepepsinogen. In human gastric mucosa, 2
immunologically distinct classes of pepsinogen are synthesized. PG1 is restricted to the
corpus, while PG2 is found throughout the stomach as well as in the proximal duodenum.
PG1 is found in serum and urine in a ratio of about 1 to 10. PG2 is present in serum and
seminal fluid but only trace amounts are found in urine. Serum PG1 and PG2 apparently
originate from the stomach in the main, because the levels are very low after gastrectomy.
PG2 in seminal fluid probably originates from the prostate. Frants et al. (1984) proposed a
new genetic model to explain the inheritance of the urinary pepsinogen (PG1) polymorphism.
They proposed that each main fraction (3, 4, and 5) in the multibanded electrophoretic
pattern is determined by its own specific gene, B, C and D, respectively. The relative
intensities of the fractions are determined by gene copy numbers. According to this model
the PG1 system is inherited as autosomal codominant haplotypes. Some critical families not
explained by previous models were presented in support of the hypothesis. In a note added in
proof, the authors reported the resolution of a workshop to use PGA and PGC in place of PG1
and PG2, respectively. In man, there are 2 related pepsinogen systems: PGA, formerly PG I,
precursor of pepsin A (EC 3.4.23.1), and PGC, formerly PG II, precursor of pepsin C (EC
3.4.23.3).

Except for the autosomal inheritance of the PGA polymorphism, no definite data on
the chromosomal localization of these genes were available until the mapping of pepsinogen
A to chromosome 11 (Frants et al., 1985; Taggart et al., 1985). The polymorphism of PGA is
due to variation in the number of genes in the centromere region of chromosome 11. Taggart
et al. (1985) proposed that the PG I isozymogens, Pg3, Pg4, and Pg5, are encoded by closely
linked genes, PGA3 (169710), PGA4 (169720), and PGA5 (169730), and that their presence
or absence in different haplotype combinations determines phenotypic variation of PG I.
Taggart et al. (1985) used a pepsinogen cDNA probe with man-rodent somatic cell hybrids to
show that the complex is on chromosome 11. By means of 3 different X;11 translocations,
they narrowed the assignment to 11p12-11q13. Frants et al. (1985) likewise mapped PGA to
chromosome 11 (11pter-11q12). Nakai et al. (1986) assigned the pepsinogen genes to 11q13
by in situ hybridization. Kidd (1986) found that the pepsinogen cluster is about 20 cM on the
centromeric side of the CAT locus (115500). Hayano et al. (1986) obtained a cosmid clone
containing 2 PGA genes in a single insert. Restriction endonuclease mapping showed that
the two have very similar but distinct structures and that they are closely linked. The close
situation of genes of very similar structure probably facilitates unequal crossing-over, which
accounts for a high frequency of haplotype variation in copy number of PGA genes (Taggart
et al., 1985). Taggart et al. (1987) analyzed by Southern blot analysis of DNA from somatic
cell hybrids the 3 most common PGA haplotypes and demonstrated the presence of 3 genes
in the PGA-A haplotype (PGA3, PGA4, and PGA5); 2 genes in the B haplotype (PGA3 and
PGA4); and 1 gene in the C haplotype (PGA4). This unusual polymorphism of genomic
DNA encoding very similar proteins probably reflects recent evolution by gene duplication.
Kishi and Yasuda (1987) identified a 'new' polymorphism. Evers et al. (1987) contributed to
the understanding of the molecular basis for the heterogeneity of the PGA isozymogen
pattern by studies at the DNA level in a pair of pepsinogen genes. They demonstrated a
single nucleotide difference giving rise to a glu-to-lys substitution of the 43rd amino acid
residue of the activation peptide, leading to a charge difference of the corresponding
isozymogens. The substitution was in 1 of 2 tandem genes. Zelle et al. (1988) amplified on
the hypothesis that the heterogeneity in pepsinogen A resides in the existence of a variable
number of copies of PGA genes and different combinations of these genes. From restriction
enzyme analysis of the cluster, they developed hypotheses for the creation of the variety of
haplotypes through unequal but homologous crossing over. In the PGA gene quadruplet, for
example, 4 genes are arranged in a highly ordered fashion in a head-to-tail orientation. Using
the length in kilobases of the large polymorphic EcoRI fragment of the PGA genes, this
quadruplet could be described as 15.0-12.0-12.0-16.6.

See, for example, Evers, et al., Hum. Genet. 77: 182-187, 1987. PubMed ID: 3115885;
Frants, et al., Hum. Genet. 65: 385-390, 1984. PubMed ID: 6693125; Frants, et al.,
Cytogenet. Cell Genet. 40: 632 only, 1985; Gedde-Dahl, et al., Cytogenet. Cell Genet. 22:
301-303, 1978. PubMed ID: 752491; Hayano, et al., Biochem. Biophys. Res. Commun. 138:
289-296, 1986. PubMed ID: 3017318; Korsnes, et al., Ann. Hum. Genet. 44: 185-194,
1980. PubMed ID: 7316469; Nakai, et al., Cytogenet. Cell Genet. 43: 215-217, 1986.
PubMed ID: 3467902; Samloff, et al., Am. J. Hum. Genet. 25: 178-180, 1973. PubMed ID:
4689038; Sogawa, et al., J. Biol. Chem. 258: 5306-5311, 1983. PubMed ID: 6300126;
Szymura and Klein, Immunogenetics 13: 267-271, 1981. PubMed ID: 7275224; Taggart, et
al., Somat. Cell Molec. Genet. 13: 167-172, 1987. PubMed ID: 3031827; Taggart, et al.,
Proc. Nat. Acad. Sci. 82: 6240-6244, 1985. PubMed ID: 3862130; Weitkamp, et al.,
Cytogenet. Cell Genet. 14: 451-452, 1975; Weitkamp, et al., Am. J. Hum. Genet. 27: 486-
491, 1975. PubMed ID: 1155457; Whittington, et al., Cytogenet. Cell Genet. 28: 145-150,
1980. PubMed ID: 7438789; and Zelle, et al., Hum. Genet. 78: 79-82, 1988. PubMed ID:
2892778.

The protein similarity information, expression pattern, cellular localization, and map 
location for the NOV2 proteins and nucleic acids disclosed herein suggest that these Pepsin A 
Precursor-like proteins may have important structural and/or physiological functions 
characteristic of the Pepsin A Precursor family. Therefore, the nucleic acids and proteins of 
the invention are useful in potential diagnostic and therapeutic applications and as a research 
tool. These include serving as a specific or selective nucleic acid or protein diagnostic and/or 
prognostic marker, wherein the presence or amount of the nucleic acid or the protein are to be 
assessed. These also include potential therapeutic applications such as the following: (i) a 
protein therapeutic, (ii) a small molecule drug target, (iii) an antibody target (therapeutic, 
diagnostic, drug targeting/cytotoxic antibody), (iv) a nucleic acid useful in gene therapy (gene 
delivery/gene ablation), (v) an agent promoting tissue regeneration in vitro and in vivo, and 
(vi) a biological defense weapon. 

The novel nucleic acids and proteins of the invention have applications in the 
diagnosis and/or treatment of various diseases and disorders. For example, the compositions 
of the present invention will have efficacy for the treatment of patients suffering from: 
hypercalcemia, ulcers, cancer, as well as other diseases, disorders and conditions.

The novel NOV2 nucleic acids encoding the Pepsin A Precursor-like proteins of the
invention, or fragments thereof, are useful in diagnostic applications, wherein the presence or 
amount of the nucleic acid or the protein are to be assessed. These materials are further 
useful in the generation of antibodies that bind immunospecifically to the novel substances of 
the invention for use in therapeutic or diagnostic methods. These antibodies may be 
generated according to methods known in the art, using prediction from hydrophobicity 
charts, as described in the "Anti-NOVX Antibodies" section below. The disclosed NOV2 
protein has multiple hydrophilic regions, each of which can be used as an immunogen. In 
one embodiment, a contemplated NOV2 epitope is from about amino acids 2 to 4. In another 
embodiment, a contemplated NOV2 epitope is from about amino acids 40 to 70. In 
alternative embodiments, contemplated NOV2 epitopes include from about amino acids 140 
to 145, 160 to 163, 210 to 215, 240 to 245, 290 to 305, 340 to 342, 350 to 353 and 380 to 
385. 
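
As an illustration of what prediction from a hydrophobicity chart involves, the following Python sketch averages a per-residue hydropathy scale over a sliding window and reports the windows that fall below a cutoff. It is a minimal sketch under stated assumptions: the Kyte-Doolittle scale, the 7-residue window, the -1.0 cutoff, and the function names are all illustrative choices, and nothing here is asserted to be the procedure used to select the epitope ranges listed above.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic).
# Using this particular scale is an assumption made for illustration.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=7):
    """Return the mean hydropathy of every full window along seq."""
    if window < 1 or window > len(seq):
        return []
    scores = [KYTE_DOOLITTLE[res] for res in seq.upper()]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def hydrophilic_windows(seq, window=7, threshold=-1.0):
    """Return 1-based (start, end) ranges whose mean hydropathy is below threshold."""
    profile = hydropathy_profile(seq, window)
    return [
        (i + 1, i + window)
        for i, mean in enumerate(profile)
        if mean < threshold
    ]
```

Overlapping windows returned by `hydrophilic_windows` would normally be merged into continuous candidate regions before any are considered as immunogens.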

NOV3 

A disclosed NOV3 nucleic acid (designated as CuraGen Acc. No. CG56936-01)
encodes a novel Ribonuclease Pancreatic-like protein and includes the 479 nucleotide
sequence (SEQ ID NO:13) shown in Table 3A. An open reading frame for the mature
protein was identified beginning with a GGC codon at nucleotides 13-15 and ending with a
TAG codon at nucleotides 474-476. Putative untranslated regions downstream from the
termination codon and upstream from the initiation codon are underlined in Table 3A, and the
start and stop codons are in bold letters.

Table 3A. NOV3 Nucleotide Sequence (SEQ ID NO:13) 

AGGAAACTATCTG GCCTCAAGTCATCACAAGTGACAAGAACAAACCCCTCTGTGGGGGAATAGTGGTACCTGCAG 
GCAGGGTATCTTGTGCCTTCAATGAGCTGACAGACTGT(^TTTTGAACTTTGTCTCACTCTGAAA6CAGAAAATC 
GCCGAAAGGTTTTGGCAAGCAACCTTCTTGGGAGAAATGCAAATACCATTGATTTTTCGAGGCCTCTCATGGATG 
AAGACATGCTCCTTTTTACAAGTGTGGTCAGGTTCCCTGATAACTCTTTGTATGATCATGTGGTTGCAGTACCTT 
GCAGGAACGGGAACGTCATTCTGAGGGTAGTCCACATGCAAGTGTTCTAAAGTTGACATCACTGCTTCATCATTC 
ACCTCaVTTTTCCCAGAACAGAAGCACCAAGAAAATTATCACCATTGCCATTGAGAGAAGAGATCTCAGACTCXSGG 
AGCTGATCTTGAGTTATTTAACATAGCCA 

The nucleic acid sequence of NOV3 maps to chromosome 14 and has no similarity on 
the DNA level to any known sequence. 

A disclosed NOV3 polypeptide (SEQ ID NO:14) is 141 amino acid residues in length 
and is presented using the one-letter amino acid code in Table 3B. The SignalP, Psort and/or 
Hydropathy results predict that NOV3 has a signal peptide and is likely to be localized to the 
endoplasmic reticulum (membrane) with a certainty of 0.5500. In alternative embodiments, a 
NOV3 polypeptide is located to the lysosome (lumen) with a certainty of 0.1900, the
endoplasmic reticulum (lumen) with a certainty of 0.1000, or the outside of the cell with a
certainty of 0.1000. SignalP predicts a likely cleavage site for a NOV3 peptide between
amino acid positions 19 and 20, i.e., at the dash in the sequence VND-EA.
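
The stated cleavage position can be checked against the N-terminus of the NOV3 protein printed in Table 3B. The short Python sketch below splits the first 21 residues after position 19 and rebuilds the junction string; the helper name `split_at_cleavage` is hypothetical and introduced only for this illustration.

```python
def split_at_cleavage(seq, last_signal_pos):
    """Split seq after the 1-based residue number last_signal_pos."""
    if not 0 < last_signal_pos < len(seq):
        raise ValueError("cleavage position lies outside the sequence")
    return seq[:last_signal_pos], seq[last_signal_pos:]


# First 21 residues of the NOV3 protein as printed in Table 3B.
NOV3_N_TERMINUS = "MAMVIIFLVLLFWENEVNDEA"

signal, mature_start = split_at_cleavage(NOV3_N_TERMINUS, 19)
junction = f"{signal[-3:]}-{mature_start[:2]}"
print(junction)
```

Under these inputs the predicted signal peptide is 19 residues long and the mature protein begins with the residues that follow the dash.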



Table 3B. Encoded NOV3 Protein Sequence (SEQ ID NO:14) 



MAMVIIFLVLLFWENEVNDEAVMSTLEHIJIVDYPQNDVPVPARYC^ 

NGXCISPKKVACQNLSAIFCFQSETKFKMTVCQLIEGTRYPACRYHYSPTEGFVLVTCDDLRPDSF 



The NOV3 amino acid sequence was found to have 39 of 134 amino acid residues
(29%) identical to, and 69 of 134 amino acid residues (51%) similar to, the 156 amino acid
residue ptnr:SWISSNEW-ACC:P07998 protein from Homo sapiens (Human)
(RIBONUCLEASE PANCREATIC PRECURSOR (EC 3.1.27.5) (RNASE 1) (RNASE A)
(RNASE UPI-1) (RIB-1)) (E = 1.3e ^^).

NOV3 is expressed in at least the following tissues: pancreas, lung, testis, and b-cell. 
Expression information was derived from the tissue sources of the sequences that were 
included in the derivation of the sequence of CuraGen Acc. No. CG56936-01.

Possible single nucleotide polymorphisms (SNPs) found for NOV3 are listed in Table 3C.



Table 3C: SNPs

Variant    Nucleotide Position  Base Change  Amino Acid Position  Amino Acid Change
13376210   117                  T>C          NA                   NA
13376983   164                  C>T          55                   Pro>Leu
13376211   205                  A>G          69                   Arg>Gly
13376985   338                  A>G          113                  Tyr>Cys
13376986   354                  C>T          NA                   NA
13376987   371                  A>G          124                  Glu>Gly



NOV3 has homology to the amino acid sequences shown in the BLASTP data listed 
in Table 3D. 



Table 3D. BLAST results for NOV3

Gene Index/Identifier             Protein/Organism                      Length (aa)  Identity (%)  Positives (%)  Expect
gi|12853968|dbj|BAB29898.1|       Pancreatic ribonucleases containing   208          37/107 (34%)  59/107 (54%)   6e-09
                                  protein: Pfam, source evidence;
                                  ISS~putative [Mus
gi|13124491|sp|Q9QYX3|RNP_MUSPA   Ribonuclease pancreatic precursor     149          37/130 (28%)  66/130 (50%)   1e-08
                                  (RNase 1) (RNase A)
gi|13399882|pdb|1DZA|A            Chain A, 3-D Structure Of A Hp-Rnase  129          35/115 (30%)  58/115 (50%)   1e-08
gi|133226|sp|P19644|RNP_PREEN     RIBONUCLEASE PANCREATIC (RNASE 1)     128          31/91 (34%)   51/91 (55%)    1e-08
                                  (RNASE A)
gi|464659|sp|P80287|RNP_IGUIG     RIBONUCLEASE PANCREATIC (RNASE 1)     119          32/118 (27%)  58/118 (49%)   1e-08
                                  (RNASE A)



The homology of these sequences is shown graphically in the ClustalW analysis 
shown in Table 3E. 



Table 3E. ClustalW for NOV3 



1) NOV3 (SEQ ID NO:14)
2) gi|12853968| (SEQ ID NO:135)
3) gi|13124491| (SEQ ID NO:136)
4) gi|13399882| (SEQ ID NO:137)
5) gi|133226| (SEQ ID NO:138)
6) gi|464659| (SEQ ID NO:139)

[ClustalW alignment; only the row ends of the first block are legible, shown in the order listed above with their end positions.]

-YPC^VPVPA 42
[efwpsdsqHeegegiwtte 60
--SSC^Sl^P 48
k-SGNlPsSs 24
k-SGsjpsSs 23
h-yPElSJ^PN 22




Table 3F lists the domain description from DOMAIN analysis results against NOV3.
This indicates that the NOV3 sequence has properties similar to those of other proteins 
known to contain these domains. 



Table 3F. Domain Analysis of NOV3

gnl|Smart|smart00092, RNAse_Pc, Pancreatic ribonuclease
CD-Length = 123 residues, 80.5% aligned
Score = 68.2 bits (165), Expect = 3e-13

NOV 3:  30 HVDYPQNDVPVPARYCNHMIIQRVIREPDHTCKKEHVFIHERPRKINGICISPKKVACQN 89
Sbjct:  12 HIDSTPS~-SASDNYCNQMMKRI^TQ--GRCKPVNTFVHESrADVKAVC-SQKNVTCKW 66

NOV 3:  90 LSAIFCFQSETKFKMTVCQLIEGTRYPACRYHYSPTEGFVLVTCD 134 (SEQ ID NO:140)
Sbjct:  67 -GRTNCHQSNSRFQLTDCRLTGGSKYPNCRYKTTQANKHIIVACE 110 (SEQ ID NO:141)

gnl|Pfam|pfam00074, rnaseA, Pancreatic ribonuclease. Ribonucleases. Members
include pancreatic RNAase A and angiogenins. Structure is an alpha+beta fold -
long curved beta sheet and three helices.
CD-Length = 122 residues, 73.0% aligned
Score = 64.3 bits (155), Expect = 4e-12

NOV 3:  42 ARYCNHMIIQRVIREPDHTCKKEHVFIHERPRKINGICISPKKVACQNLSAIFCFQSETK 101
Sbjct:  22 DNYCNQMMKRRNMTQG--RCKPVNTFVHESIJU>VKAVC-SQKNVTCKNGQKN-Cr^Q 77

NOV 3: 102 FKMTVCQLIEGTRYPACRYHYSPTEGFVLVTCD 134 (SEQ ID NO:142)
Sbjct:  78 FQI^TDCKLTGGSKYPNCRYRTTPGNKRIIVACE 110 (SEQ ID NO:143)



Pancreatic ribonuclease (EC 3.1.27.5) is one of the digestive enzymes secreted in
abundance by the pancreas. Elliott et al. (Cytogenet. Cell Genet. 42: 110-112, 1986) mapped
the mouse gene to chromosome 14 by Southern blot analysis of genomic DNA from
recombinant inbred strains of mice, using a probe isolated from a pancreatic cDNA library
with the rat cDNA. The assignment to mouse 14 and the close linkage to the other 2 loci was
confirmed by study of one of Snell's congenic strains: the 3 loci went together. Elliott et al.
(Cytogenet. Cell Genet. 42: 110-112, 1986) predicted that the homologous human gene RIB1
is on chromosome 14.

Human pancreatic RNase is monomeric and is devoid of any biologic activity other
than its RNA degrading ability. Piccoli et al. (Proc. Nat. Acad. Sci. 96: 7768-7773, 1999)
engineered the monomeric form into a dimeric protein with cytotoxic action on mouse and
human tumor cells, but lacking any appreciable toxicity on human and mouse normal cells.
The dimeric variant of human pancreatic RNase selectively sensitized cells derived from a
human thyroid tumor to apoptotic death. Because of its selectivity for tumor cells, and
because of its human origin, this protein was thought to represent an attractive tool for
anticancer therapy.

The protein similarity information, expression pattern, cellular localization, and map 
location for the NOV3 protein and nucleic acid disclosed herein suggest that this ribonuclease 
pancreatic-like protein may have important structural and/or physiological functions 
characteristic of the Ribonuclease Pancreatic family. Therefore, the nucleic acids and 
proteins of the invention are useful in potential diagnostic and therapeutic applications and as 
a research tool. These include serving as a specific or selective nucleic acid or protein 
diagnostic and/or prognostic marker, wherein the presence or amount of the nucleic acid or 
the protein are to be assessed. These also include potential therapeutic applications such as 
the following: (i) a protein therapeutic, (ii) a small molecule drug target, (iii) an antibody 
target (therapeutic, diagnostic, drug targeting/cytotoxic antibody), (iv) a nucleic acid useful in 
gene therapy (gene delivery/gene ablation), (v) an agent promoting tissue regeneration in 
vitro and in vivo, and (vi) a biological defense weapon.

The nucleic acids and proteins of the invention have applications in the diagnosis 
and/or treatment of various diseases and disorders. For example, the compositions of the 
present invention will have efficacy for the treatment of patients suffering from cancer as 
well as other diseases, disorders and conditions. 

The novel nucleic acid encoding the Ribonuclease Pancreatic-like protein of the 
invention, or fragments thereof, are useful in diagnostic applications, wherein the presence or 
amount of the nucleic acid or the protein are to be assessed. These materials are further 
useful in the generation of antibodies that bind immunospecifically to the novel substances of 
the invention for use in therapeutic or diagnostic methods. These antibodies may be 
generated according to methods known in the art, using prediction from hydrophobicity 
charts, as described in the "Anti-NOVX Antibodies" section below. The disclosed NOV3 
protein has multiple hydrophilic regions, each of which can be used as an immunogen. In 
one embodiment, a contemplated NOV3 epitope is from about amino acids 20 to 30. In 
another embodiment, a contemplated NOV3 epitope is from about amino acids 35 to 42. In 
other specific embodiments, contemplated NOV3 epitopes are from about amino acids 52 to 
55, 60 to 70, 70 to 72, 110 to 115, 118 to 124 and 130 to 135.

NOV4 and NOV5

This invention includes two novel Ser/Thr kinase-like proteins. The disclosed 
proteins have been named NOV4 and NOV5. 



NOV4 

A disclosed NOV4 nucleic acid (designated as CG51707-02) encodes a novel Ser/Thr
Kinase-like protein and includes the 1037 nucleotide sequence (SEQ ID NO: 15) shown in 
Table 4A. An open reading frame for the mature protein was identified beginning with an 
ATG codon at nucleotides 41-43 and ending with a TGA codon at nucleotides 1019-1021. 
Putative untranslated regions downstream from the termination codon and upstream from the 
initiation codon are underlined in Table 4A, and the start and stop codons are in bold letters. 



Table 4A. NOV4 Nucleotide Sequence (SEQ ID NO:15)

GCGCCGCGTQGGGGACGGAAGTGAAACTCTAAGi^TGAGA TGGAGAAGTACGAGCGGATCCGAGTGGTGG^ 

GGTGCCTTCGGGATTGTGmCCTGTGCCTGCGAAAGGCTGACCAGAAGCTGGTGATCATCAAGCAGA^ 

AACAGATGACCAAGGAAGAGCGGCAGGCAGCCCAGAATGAGTGCCAGGTCCrCAAGCTGCTC^ 

CATTGACTACTACGAGAACTTCCTGGAAGACAAAGCCClTATGACXrGCCATGGAATA 

GCTGAGTTCATCCAAAAGOXrTGTAATTCCCTGCrGGAGGAGGAGACCATCCT^ 

TTGCACTGCATCATGTGCACACCCACCTCATCCTGCACCGAGACCTCT^GACCCAGAAC^ 

CCGCT^TGGTCGTCAAGATCGGTGATTTCGGCATCTCCAAGATCCTTAGCAGCAAGAGCAAGGCCTACACGGTG^ 

GGTACCCCATGCTATATCTCCCCTGAGCTGTGTGAGGGCAAGCCCTACa^CCAGT^GAGTGACATCTG^ 

GCTGTGTCCTCTACGAGCTG6CCAGCCTCAAGAGGGCTTTCGAGGCTGCGAACTTGCCAGC3VCTGGTGCT 

CATGAGTGGCACCTTTGCACCTATCTCTGACCGCTACAGCCCTGAGCTTC^ 

CTGGAGCCTGCCCAGCGGCCACCACTCAGCCACATCATGGCACyVGCCCCTCTGCATCCGTGCCC^ 

ACACCGACGTGGGCAGTGTCCGCATGCGGAGGCCTGTGCAGGGACAGCXSAGCGGTCCTGGGCGGCAG^ 

ACCCAGTGGGAGCACACTTTCGCCTCIXSACTGTGTCCGCCaiCAGCCTGCACCTACACTCTGTC^ 

GACACCrrTGCACCATGATCTGAAAACACAATGACTTAGTCATCTGCCA^ 



The nucleic acid sequence of NOV4 maps to chromosome 17 and has 463 of 759 bases
(61%) identical to a gb:GENBANK-ID:AF087909|acc:AF087909.1 mRNA from Homo
sapiens (Homo sapiens NIMA-related kinase 6 (NEK6) mRNA, complete cds) (E = 1.9e^^).

The NOV4 polypeptide (SEQ ID NO:16) is 326 amino acid residues in length and is
presented using the one-letter amino acid code in Table 4B. The SignalP, Psort and/or
Hydropathy results predict that NOV4 does not have a signal peptide and is likely to be
localized to the cytoplasm with a certainty of 0.6500. In alternative embodiments, a NOV4
polypeptide is located to the lysosome (lumen) with a certainty of 0.1866 or the
mitochondrial matrix space with a certainty of 0.1000.
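
The stated open reading frame coordinates and the stated polypeptide length can be cross-checked with simple arithmetic. The Python sketch below assumes 1-based inclusive coordinates with the ATG starting at nucleotide 41 and the TGA ending at nucleotide 1021, as given above; the function name `orf_summary` is hypothetical.

```python
def orf_summary(start, stop_end):
    """Return (nucleotides, codons, residues) for an ORF.

    start is the first base of the start codon and stop_end is the last
    base of the stop codon, both 1-based and inclusive.
    """
    length = stop_end - start + 1
    if length <= 0 or length % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    codons = length // 3
    # The stop codon is counted as a codon but encodes no residue.
    return length, codons, codons - 1


nov4 = orf_summary(41, 1021)
print(nov4)
```

The result is 981 nucleotides and 327 codons, that is, 326 residues plus the stop codon, which agrees with the 326 amino acid length given for the NOV4 polypeptide.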



Table 4B. Encoded NOV4 Protein Sequence (SEQ ID NO:16) 

MEKYERIRWGRGAFGIVHIiCI.RKADQKLVIIKQIPVEQMTKEERQAAQNECQVLia.LM 

AIJ^AMEYAPGGTLAEFIQKRQTSIiIjEESTILHFFVQIIiIAimCVHTHL I LHRDLKTQNI LLDKHRMWKIGDF 
GISKILSSKSKAYTWGTPCYISPELCEGKPYNQKSDIWALGCVLYEIASLKRAFEAANLPAIA7L
ISDRYSPELRQLVLSLLSLEPAQRPPLSHIMAQPnCIRALLNLHTDVGSVra^imPVQGQRAVI^R 
LSPLTVSATACTYTLSSFTIDTLHHDLKTQ ^ 

The NOV4 amino acid sequence was found to have 152 of 333 amino acid residues
(45%) identical to, and 218 of 333 amino acid residues (65%) similar to, the 357 amino acid
residue ptnr:SPTREMBL-ACC:O01775 protein from Caenorhabditis elegans (SIMILARITY
TO THE CDC2/CDX SUBFAMILY OF SER/THR PROTEIN KINASES) (E = 1.6e^^).
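
The percentages quoted for NOV4 follow from the residue counts if the quotient is truncated to a whole number rather than rounded. The Python sketch below reproduces them; truncation is an inference from the figures themselves, not a statement of how the alignment program computes its output, and the function name is hypothetical.

```python
def percent_truncated(matches, aligned):
    """Whole-number percentage of matches over aligned positions, truncated."""
    if aligned <= 0:
        raise ValueError("aligned length must be positive")
    return matches * 100 // aligned


identity = percent_truncated(152, 333)    # identical residues
similarity = percent_truncated(218, 333)  # similar residues
dna_identity = percent_truncated(463, 759)  # nucleotide comparison with NEK6
print(identity, similarity, dna_identity)
```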

NOV4 is expressed in at least the following tissues: fetal lung, other developmental 
tissues, germ cells and sex tissues. Expression information was derived from the tissue 
sources of the sequences that were included in the derivation of the sequence of NOV4. 

Possible single nucleotide polymorphisms (SNPs) found for NOV4 are listed in Table 4C.



Table 4C: SNPs

Variant    Nucleotide Position  Base Change  Amino Acid Position  Amino Acid Change
13376988   105                  T>G          22                   Leu>Arg
13376989   204                  T>C          55                   Leu>Pro
13376990   368                  G>A          110                  Val>Met
13376991   712                  T>C          NA                   NA
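
The amino acid positions in Table 4C can be related to the nucleotide positions through the start of the NOV4 open reading frame at nucleotide 41. The Python sketch below shows that mapping for the three variants that list an amino acid position; the function and constant names are hypothetical and 1-based coordinates are assumed.

```python
def codon_number(nt_pos, orf_start):
    """Return the 1-based codon index containing 1-based nucleotide nt_pos."""
    if nt_pos < orf_start:
        raise ValueError("position lies upstream of the open reading frame")
    return (nt_pos - orf_start) // 3 + 1


NOV4_ORF_START = 41  # first base of the ATG codon, as stated for NOV4

# Nucleotide position -> amino acid position, as listed in Table 4C.
table_4c = {105: 22, 204: 55, 368: 110}

for nt_pos, listed_aa in table_4c.items():
    print(nt_pos, codon_number(nt_pos, NOV4_ORF_START), listed_aa)
```

For each of these three variants the computed codon index equals the listed amino acid position.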



NOV4 also has homology to the amino acid sequences shown in the BLASTP data 
listed in Table 4D. 



Table 4D. BLAST results for NOV4

Gene Index/Identifier                  Protein/Organism                       Length (aa)  Identity (%)   Positives (%)  Expect
gi|15825377|gb|AAL09675.1|AF407579_1   NIMA-related kinase 8 [Mus musculus]   698          273/276 (98%)  275/276 (98%)  e-157
(AF407579)
gi|12852471|dbj|BAB29424.1|            data source:SPTR, source key:P51954,   291          275/280 (98%)  277/280 (98%)  e-155
(AK014546)                             evidence:ISS~putative~similar to
                                       SERINE/THREONINE-PROTEIN KINASE NEK1
                                       (EC 2.7.1.-) (NIMA-RELATED PROTEIN
                                       KINASE 1) [Mus musculus]
gi|15825379|gb|AAL09676.1|AF407580_1   NIMA-related kinase 8 [Danio rerio]    697          242/323 (74%)  276/323 (84%)  e-138
(AF407580)
gi|17511015|ref|NP_491914.1|           ser/thr-protein kinase                 357          148/335 (44%)  212/335 (63%)  3e-71
(NM_059513)                            [Caenorhabditis elegans]
gi|7301213|gb|AAF56344.1|              CG10951 gene product                   841          125/265 (47%)  177/265 (66%)  2e-64
(AE003749)                             [Drosophila melanogaster]



The homology of these sequences is shown graphically in the ClustalW analysis 
shown in Table 4E. 



Table 4E. ClustalW Analysis for NOV4 



1) NOV4 (SEQ ID NO:16)
2) gi|15825377| (SEQ ID NO:144)
3) gi|12852471| (SEQ ID NO:145)
4) gi|15825379| (SEQ ID NO:146)
5) gi|17511015| (SEQ ID NO:147)
6) gi|7301213| (SEQ ID NO:148)



N0V4 

gi 1 15825377 I 
gij 128524711 
gi 1 15825379 I 
gi 1175110151 
gi 17301213 I 



NOV4 

gi 115825377 I 
gij 128524711 
gij 15825379 I 
gij 17511015 1 
gij 7301213 [ 



gi 1 15825377 I 
gij 12852471 I 
gi|15825379| 
gij 17511015 1 
gij 7301213 I 



10 



20 



30 



40 



50 



1 
1 
1 
1 
1 

61 



60 
.1 



MKKFRAKASSLPIFNGRITDATTLTTSSLQLPLGQNTQRKQSTCTRVLPTVFTITDGTTG 60 



70 



90 



110 



AASTSI*AEAMSSSKAQMPNRQESLLQLSVPRETGVGVAGP1 




150 



160 



170 



180 



NOV4 


20 


gi| 15825377 1 


20 


gij 12852471 1 


20 


gi|l5825379| 


20 


gij 17511015 j 


20 


gij 7301213 1 


121 


NOV4 


78 


gi| 15825377 | 


78 


gij 12852471 1 


78 


gij 158253791 


78 


gij 175110151 


80 


gij 7301213 1 


179 


NOV4 


131 



131 
131 
131 
140 
233 
NOV4


gi 


15825377 


gi 


12852471 


g± 


15825379 


gi 


17511015 


gi 


7301213 1 



340 



350 



360 



N0V4 

gi 1 15825377 I 
gi[ 12852471 I 
gi|l5825379 j 
gi 1 175110151 
gi 17301213 I 



NOV4 

gi 1 15825377 I 
gi[ 12852471 I 
gi 1158253751 
gi 1 17511015 1 
gi i 7301213 I 



N0V4 

gi 1 15825377 I 
gi I 12852471 I 
gi 115825379 I 
gi 1 17511015 1 
gi 173 012 13 I 



NOV4 

gi 1 15825377 I 
gi 1 12852471 1 
gi I 158253791 
gi 1 17511015) 
gi|7301213 | 



NOV4 

gi 1 15825377 I 
gi I 12852471 I 
gi 1 15825379 I 
gi 1 17511015 I 
gi 1 7301213 I 



N0V4 



gi 
gi 
gi 
gi 
gi 



15825377 
12852471 
15825379 
17511015 
73012131 




410 

Q 281 

[PPIASGSTGSRATSA 299 

D 281 

--HGRPGGWITST 299 
_ 288 
'6SD|LTAPVPAAAYSNVSMELELPTAQTETK 411 



450 



430 440 
....|....|....|....|....| 

281 --RAV 

300 RCRGVPRGPV- -RPAIPPPLSSVYi 

281 GS 

300 RTRGGLSSLTSSKMMHPLPLFS" 

288 

412 QLMIADTAAPHEILEKRSVI,YQIJC?^rCFSl 



480 




460 470 

.|....|....|....|....t 

292 

.NTBWQVAAGRTQKAGVTRS 356 

291 

IQVSLGRTQKMGVTKS 358 

291 

PKAVIVDVAMSDSHFVWNBD 470 




510 



500 
I 

292 --S gSgLSP 

357 grlilweapplgagSgIllpgavelpqpqfvsrflegqj 

291 

359 GRLITWBAPSVGS -gEPTLPGAVEQMQPQFI SRFLEGQi 

291 SAIlSS 

471 GSAYAWGEGTHGOlBlIaLEAWKHYP-SR ME; 



520 
I 



530 



540 



— LTflSATACTYTLlSFTIDT- 318 
SGVtJkHVACGDLfJaCLTDRG 416 




291 

SCGDLEgrCLTDRG 417 
ITYPTQSTLRPYSLSSN 316 
SACAGDGF|iLVTQAG 524 



560 



570 



N0V4 


326 


gi 


15825377 


477 


gi 


12852471 


291 


gi 


15825379 


478 


gi 


17511015 


343 


gi 


7301213 1 


585 



550 

....|....|, 
318 

417 IIMTFGSGSNGi 

291 

418 iimtfgsgsngc3ghgnfi^vtqpkiveallgyelvqvscgashv|avtn|revjs 

316 APTTHLTQIiTP ||PSHljsGF|s; 

525 SLLSCGSNAHIJ^QDEQRimiSPKI>IARLADVRVBQVAAGLQHV§ALSR|GA^ 

610 620 630 640 



600 



580 590 

LKTQ 326 

IHGNLTDI SQPTI VEALLGYEMVQVACGASHVgALSTBGEL§AWgRG 476 




326 



291 



357 



670 



690 



700 



710 



720 



680 

326 326 

53 5 LDHLSI^DEEPVPYQQVEEALSFTPLGSAPLDQEPIjIiCVDLCTAHSAAITASGDCYTFGSN 594 

291 291 

53 6 IJ>KVSGTEEPSSFCQVEEVHLFQLVQSAPIJm:KIVYIDIGTAHSVAVTEKGQCFTFGSN 595 

357 — 357 

644 FQRSAKITAFKKVQLP HKVTQACPSSTHSVFLVEGGYVYTMGRN 688 



1. 



326 



730 



740 



750 



760 



770 



780 



326 



595 QHGQLGTSSRRVSRAPCRVQGLEGIKMVMVAOSDAFTVAVGAEGEVnfSWGKGTRGRLGI^ 654 
291 -- 291 

596 QHGQI1GCSHRRSSRVPYQVSGLQG--ITMAAC6DAFTLAIGAEGEVYTWGKGARGRLGRK 653 

357 357 

689 AEGQRGIRHCNSVDHPTI.VDSVKSRYIVKANCSDQCTIVASEDNIITVWGTRN-GLPGIG 747 
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N0V4 

gi 1 15825377 I 
gij 12852471 j 
gi I 15825375] 
gi 1 17511015 1 
gij 73012 13 I 



NOV4 

gi 1 15825377 I 
gij 12852471 I 
gi 1 15825379 1 
gij 17511015 I 
gi 1 7301213 I 



790 
..|.. 



800 



810 



820 



830 



840 



326 

698 

291 

697 

357 

748 STKCGLGIK3ICTPNMELEIX3NNTAAFTNFLASVYKSEa^ILBPVDIIA^ 807 



326 

655 DEDAGLPRPVQLD ETHPYMVTSVSCCHGNTLLAVRSVTDEPVPP- 

291 

654 EEDFGIPKPVQLD ESHAFTVTSVACCHGNTLLAVKPFFEEPGPK- 

357 



850 



860 



870 



326 
698 
291 
697 
357 
808 



VQVHDVYPLaHSVLVLVDTTTPLI SSYEGDYPHL 



326 
698 
291 
697 
357 
841 



Tables 4F-H list the domain descriptions from DOMAIN analysis results against
NOV4. This indicates that the NOV4 sequence has properties similar to those of other
proteins known to contain these domains.



Table 4F. Domain Analysis of NOV4 

gnl|Smart|smart00220, S_TKc, Serine/Threonine protein kinases, catalytic
domain; Phosphotransferases. Serine or threonine-specific kinase subfamily.
CD-Length = 256 residues, 99.2% aligned
Score = 223 bits (567), Expect = 2e-59

NOV 4:   4 YERIRWGRGAFGI VHLCLRKADQKLVI I KQI P VEQMTKEERQAAQNECQVLKLLNHPNV 63
Sbjct:   1 YELLEVLGKGAFGKWLARDKKTGKLVAIKVIKKEKLKKKKRERILREIKILKKLDHPNI 60

NOV 4:  64 IEyYENFLEDKALMTAMEYAPGGTLAEFIQKRCNSLLEEETILHFFVQILLALHH^7HTHL 123
Sbjct:  61 VKLYDVFEDDDKLYIiVMEYCEGGDLFDIiLKKRGR--LSEDEARFYARQILSftBEYLHSOG 118

NOV 4: 124 I LHRDIiKTQNILLDKHRMWKIGDFGI SKILSSKS - KAYTWGTPCY ISPELCEGKPYNQ 182
Sbjct: 119 I IHRDLKPENILUDSD- GHVKIiADFGIAKQIjDSGGTLLTTFVGTPEYMAPEVLI.GKGYGK 177

NOV 4: 183 KSDIWAIiGCVLYELASLKRAFEAANLPALVLKIMSG TFAPISDRYSPELRQLVLSLL 239
Sbjct: 178 AVDIWSLGVILYELLTGKPPFPGDDQLIiALFKKIGKPPPPFPPPEWKISPEAKDLIKKLL 237

NOV 4: 240
Sbjct: 238



Table 4G. Domain Analysis of NOV4 

gnl|Pfam|pfam00069, pkinase, Protein kinase domain.
CD-Length = 256 residues, 99.2% aligned
Score = 209 bits (533), Expect = 2e-55

NOV 4:   4 YKRIRVVGRGAFGIVHLCXRKM)QKLVIIKQIPVE<^KEERQAAQNECQVLKLIi^ 63
Sbjct:   1 YELGEKIiGSGAFGKVYKGKHKDTGEIVAIKILKKRSL- SEKKKRFLREIQILRRLSHPNI 59

NOV 4:  64 lEYYENFLEDKALMTia^EYAPGGTLAEFIQKRCNSLLEEETIIiHFFV 123
Sbjct:  60 VRLLGVFEEDDHLYLVMEYMEGGDIiFDYLR-RNGLIJ:.SEKEAKKi;U:iQILRGLEYLHSRG 118

NOV 4: 124 ILHRDLKTQNILLDKHRMWKIGDFGISKILSSK- - SKAYTWGTPCYISPELCEGKPYN 181
Sbjct: 119 IVHRDLKPENILIJ>EN-GTVKIADFGIAMa:*ESSSYEKLTTFVGTPEYMAPEVLEGRGY 177

NOV 4: 182 QKSDIWALGCVI>YELASI.KRAFEAANLPAIiVLKIMSGTF- - API SDRYSPELRQLVLSLL 239
Sbjct: 178 SKVDVWSLGVILYELLTGKLPFPGIDPLEELFRIKERPRLRIjPIiPPNCSEELKDLIKKCI* 237

NOV 4: 240 SLEPAQRPPLSHIMAQP 256 (SEQ ID NO:149)
Sbjct: 238 NKDPEKRPTi^ILNHP 254 (SEQ ID NO:151)




Table 4H. Domain Analysis of NOV4 

gnl|Smart|smart00219, TyrKc, Tyrosine kinase, catalytic domain;
Phosphotransferases. Tyrosine-specific kinase subfamily.
CD-Length = 258 residues, 96.9% aligned
Score = 136 bits (343), Expect = 2e-33

NOV 4:   8 RWGRGAFGIVHLCL RKADQKLVIIKQIPVE^^TTKEERQAAQNECQVLKLLNHPNVI 64
Sbjct:   5 KKLGEGAFGEVyKGTIiKGKGGVEVEVAVKTLKEDASEQQ- lEEFIiREARLMRKIiDHPNIV 63

NOV 4:  65 EYyENFLEDKALMTAMEyAPGGTLAEFIQKRCNSIJLiEEETIIiHFFVQILLALHHVHTHI.1 124
Sbjct:  64 KIJjGVCTEEEPLMIVMEYMEGGDIXDYLRKNRPKELSLSDLLSFALQIARGMEYLESK^ 123

NOV 4: 125 UaRDLKTQNILLDKHRMWKIGDFGISKILSSKSKAYTWGTPC YISPELCEGKPY 180
Sbjct: 124 VHRDLAARNCLVGENK- TVKIADFGLARDLYDD-DYYRKKKSPRLPIRWMAPESLKDGKF 181

NOV 4: 181 NQKSDIWALGCVLYELASL-KRAFEAANLPALVLKIMSGTFAPISDRYSPELRQLVLSLL 239
Sbjct: 182 TSKSDVWSFGVLLWEIFTLGESPYPGMSNEEVLEYLKKGYRLPQPPNCPDEIYDLMLQCW 241

NOV 4: 240 SLEPAQRPPLSHI 252 (SEQ ID NO:152)
Sbjct: 242 AEDPEDRPTFSEL 254 (SEQ ID NO:153)
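
As a consistency check on the "% aligned" figures in the domain tables, the fraction can be recomputed from the subject (Sbjct) coordinates. The following is a minimal sketch, assuming that "% aligned" means the share of the domain model's CD-Length spanned by the first and last Sbjct positions; the function name is illustrative and not part of any analysis tool.

```python
def percent_aligned(sbjct_start: int, sbjct_end: int, cd_length: int) -> float:
    """Share of the domain model covered by the alignment, in percent.

    Assumption: coverage is counted from the first to the last aligned
    position of the domain model (Sbjct), inclusive.
    """
    if not (1 <= sbjct_start <= sbjct_end <= cd_length):
        raise ValueError("coordinates must satisfy 1 <= start <= end <= cd_length")
    covered = sbjct_end - sbjct_start + 1
    return round(100.0 * covered / cd_length, 1)


# Coordinates taken from Tables 4G and 4H.
print(percent_aligned(1, 254, 256))  # Table 4G, pfam00069
print(percent_aligned(5, 254, 258))  # Table 4H, smart00219
```

Under this assumption the sketch reproduces the 99.2% and 96.9% quoted in Tables 4G and 4H.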





NOV5 

A disclosed NOV5 nucleic acid (designated as CG57081-01) includes the 1591
nucleotide sequence (SEQ ID NO:17) shown in Table 5A. An open reading frame for the
mature protein was identified beginning with an ATG codon at nucleotides 31-33 and ending
with a TAG codon at nucleotides 1495-1497. The start and stop codons of the open reading
frame are highlighted in bold type. Putative untranslated regions are underlined and found
upstream from the initiation codon and downstream from the termination codon.
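
The open reading frame coordinates can be checked against the stated polypeptide length with simple arithmetic. The sketch below is illustrative only; the function name is hypothetical, and it assumes 1-based, inclusive nucleotide coordinates running from the first base of the start codon to the last base of the stop codon.

```python
def orf_protein_length(start_codon_first_nt: int, stop_codon_last_nt: int) -> int:
    """Residues encoded by an ORF given 1-based, inclusive nucleotide coordinates.

    The span runs from the first base of the start codon to the last base of
    the stop codon; the stop codon itself encodes no residue.
    """
    span = stop_codon_last_nt - start_codon_first_nt + 1
    if span < 6 or span % 3 != 0:
        raise ValueError("ORF span must be a multiple of 3 covering start and stop codons")
    return span // 3 - 1


# NOV5: ATG at nucleotides 31-33, TAG at nucleotides 1495-1497.
print(orf_protein_length(31, 1497))
```

For NOV5 this gives 488 residues, which agrees with the length stated for the NOV5 polypeptide.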



Table 5A. NOV5 Nucleotide Sequence (SEQ ID NO:17)

TCCGGCTGCCGCGCGCACC(^GACCCX3GCG &TGAGGAGTGGCGCCGAGa3CAGGGGC^ 

CCCGGGCTCGCCGCCCCCCGGCCGCGCGCGCCCCGCCGGCTCCGAOSCGCCCTCGGCCCn^CCGCCGCCCGCTG 

AAGAAGAGGATGGGCrCGTCCATCTCX56CGGCCACCGCGCGGAGGCCGGTGTTTGACGACAA 
CTTCGACCACTTCCAGATCCTTCGGGCCATTGGGAAGGGCAGCTTTGGCAAGGTAGTGTGCA 
GGGACACGGAGAAGATGTACGCmTGAAGTACATGAACAAGCAGCaVGTGCATCGAGCGCXSAaSAG^ 
GTCTTCOSGGAGCTGGAGATCCTGCAGGAGATCGAGCATGTCTTCCTGQTGAACCTCTGGTATTCACT 






TGAGGAGGACATGTTCATGGTGGTAGACCTGCTTCTGGGTGGAGACCTACGTTACCACCTGCAGCA 
AGTTCTCCGAGGACACAGTGAGGCTGTACATCTGCGAGATGGCACTGGCTCTGGACTACCTGCGCG<^ 
ATCATCCACAGAGATGTCAAGCCTGACAACATTCTCCTGGATGAGAGAGGACaiTGCACACCTGAC^ 
CATTGCCACCATCATCAAGGACGGGGAGCGGGOSACGGCaTTAGCAGGCACC^ 

TCTTCCACrCTTTTGTCAACGGCGGGACCGGCTACTCCTTCGAGGTGGACTGGTGGTCGGTGGGGGTGATC^ 

TATGAGCTGCTGCGAGGATGGAGGCCCTATGACATCa^CTCCAGCAAOSCCGTGGAGTCCCTGGTGCAGCTGTO 

CAGCACCGTGAGCGTCCAGTATGTCCCCACGTGGTCCAAQGA6ATG6TGGGCTTGCTGCGGAAGGTGCT 

CTGTGAACCCCGAGCACCGGCTCTCCAGCCTCa^ACGTGCAGGCAGCCCCGGCGCTGGrc^ 

GACCACCTGAGCGAGAAGAGGGTGGAGCCGGGCTTCGTGCCCAACAAAGGCCGTCTGCACTGCGACCCCACCTT 

TGAGCTGGAGGAGATGATCCTGGAGTCCAGGCCCCTGCACAAGAAGAAGAAGCGCCT6GCCAAGAACAAGTCCC 

GGGACAACAGCAGGGACAGCTCCCAGTCCGAGAATGACTATCTTCAAGACTGCCTCGATGCCATCCAGOVAG 

TTCGTGATTTTTAACAGAGAAAAGCTGAAGAGGAGCCAGGACCTCCCGAGGGAGCCTCTCCCCGCCCCTGAGTC 

CAGGGATGCTGCGGAGCCTGTGGAGGACGAGGCGGAACGCTCCGCCCTGCCCATGTGCGGCCCCaVTTTGCCCCT 

CGGCCGGGAGCGGCTA GGCCGGGACGCCCGTGGTCCTCACCCCTTGAGCTGCTTTGGAGACTCGGCTGCCAGAG 

GGAGGGCCATGGGCCGAGGCCTGGCATTCACGTTCCC 



The nucleic acid sequence of NOV5 maps to chromosome 10 and has 1338 of 1549 
bases (86%) identical to a gb:GENBANK-ID:AB041542|acc:AB041542.1 mRNA from Mus 
musculus (Mus musculus brain cDNA, clone MNCb-1563, similar to AJ250840 
serine/threonine protein kinase (Mus musculus)) (E = 1 .9e'^^^). 

A disclosed NOV5 polypeptide (SEQ ID NO:18) is 488 amino acid residues in length and is
presented using the one-letter code in Table 5B. SignalP, Psort and/or Hydropathy results
predict that NOV5 does not have a signal peptide and is likely to be localized to the nucleus
with a certainty of 0.7000. In other embodiments, NOV5 is localized to the microbody
(peroxisome) with a certainty of 0.3058, the mitochondrial matrix space with a certainty of
0.1000 or the lysosome (lumen) with a certainty of 0.1000.



Table 5B. Encoded NOV5 Protein Sequence (SEQ ID NO:18) 

MRSGAERRGSSAAASPGSPPPGRARPAGSDAPSAI.PPPAAGQPRARDSGDVRSQPRPLFQWSKWKKRMGSSMSA 

ATARRPVFDDKEDVNFDHFQILRAIGKGSFGKWCIVQKRDTEKMYAMKYMNKQQCIERDEVRNVFi^ 

lEHVFIiVNIiWYSFQDEEDMFMVVDLLLGGDLRYHLQQNVQFSEDTVRIiYICEMAI^ 

ILLDERGHAHLTDFNIATI IKDGERATALAGTKP YMAPEI FHSFVNGGTGYSFEVDWWSVGVMAYELLRGWRPY 
DIHSSNAVESLVQLFSTVSVQYVPTWSKEMVGLLRKVLLTVNPEHRLSSLQDVQAAPAIAGVLWDHIi 
GFVPNKGRLHCDPTFELEBMILBSRPLHKKKKRIAKNKSRDNSRDSSQSENDYLQDCLDAIQQDFVIFN^ 
RSQDLPREPIiPAPESRPAAEPVEPEAERSALPMCGPICPSAGSG 



The NOV5 amino acid sequence was found to have 442 of 487 amino acid residues 
(90%) identical to, and 458 of 487 amino acid residues (94%) similar to, the 488 amino acid 
residue ptnr:SPTREMBL-ACC:Q9JJG4 protein from Mus musculus (Mouse) (BRAIN 
CDNA, CLONE MNCB-1563, SIMILAR TO AJ250840 SERINE/THREONINE PROTEIN 
KINASE (MUS MUSCULUS)) (E = Lle"^^^). 
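
The identity and similarity percentages quoted above follow directly from the residue counts. The sketch below is a hedged illustration: it assumes the percentages are truncated to an integer rather than rounded, which is consistent with the figures in this paragraph (442/487 is about 90.8% and is reported as 90%). The function name is hypothetical.

```python
def percent_truncated(matches: int, aligned_length: int) -> int:
    """Integer percentage of matching positions, truncated toward zero."""
    if aligned_length <= 0 or not 0 <= matches <= aligned_length:
        raise ValueError("need 0 <= matches <= aligned_length and aligned_length > 0")
    return (100 * matches) // aligned_length


# NOV5 versus the mouse protein Q9JJG4: identical and similar residues.
print(percent_truncated(442, 487), percent_truncated(458, 487))
```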

NOV5 is expressed in at least the following tissues: brain, kidney, liver, pancreas,
peripheral blood, prostate, testis, thalamus, thymus, uterus, lymph node, lymphoid tissue,
bone marrow, and spleen. Expression information was derived from the tissue sources of the
sequences that were included in the derivation of the sequence of NOV5. The sequence is
predicted to be expressed in the following tissue because of the expression pattern of
(GENBANK-ID: gb:GENBANK-ID:AB041542|acc:AB041542.1) a closely related Mus
musculus brain cDNA, clone MNCb-1563, similar to AJ250840 serine/threonine protein
kinase (Mus musculus) homolog in species Mus musculus: brain.

NOV5 also has homology to the amino acid sequences shown in the BLASTP data 
listed in Table 5C. 



Table 5C. BLAST results for NOV5

Gene Index/Identifier; Protein/Organism; Length (aa); Identity (%); Positives (%); Expect

gi|10946600|ref|NP_067277.1| (NM_021302); hypothetical serine/threonine protein kinase [Mus musculus]; 488; 441/489 (90%); 457/489 (93%); 0.0

gi|17453579|ref|XP_058348.1| (XM_058348); similar to Unknown (protein for MGC:23665) (H. sapiens) [Homo sapiens]; 369; 368/370 (99%); 368/370 (99%); 0.0

gi|13358640|dbj|BAB33045.1| (AB056389); hypothetical protein [Macaca fascicularis]; 368; 357/370 (96%); 360/370 (96%); 0.0

gi|8923754|ref|NP_060871.1| (NM_018401); gene for serine/threonine protein kinase [Homo sapiens]; 414; 261/368 (70%); 314/368 (84%); e-161

gi|7161864|emb|CAB76566.1| (AJ250840); serine/threonine protein kinase [Mus musculus]; 414; 260/368 (70%); 317/368 (85%); e-160



The homology of these sequences is shown graphically in the ClustalW analysis
shown in Table 5D.



Table 5D. ClustalW Analysis for NOV5

1) NOV5 (SEQ ID NO:18)
2) gi|10946600| (SEQ ID NO:154)
3) gi|17453579| (SEQ ID NO:155)
4) gi|13358640| (SEQ ID NO:156)
5) gi|8923754| (SEQ ID NO:157)
6) gi|7161864| (SEQ ID NO:158)

[ClustalW multiple sequence alignment of NOV5 with the five sequences listed above; the aligned rows are not recoverable from the source text.]

Tables 5E-G list the domain descriptions from DOMAIN analysis results against
NOV5. This indicates that the NOV5 sequence has properties similar to those of other
proteins known to contain these domains.



Table 5E. Domain Analysis of NOV5 

gnl|Smart|smart00220, S_TKc, Serine/Threonine protein kinases, catalytic
domain; Phosphotransferases. Serine or threonine-specific kinase subfamily.
CD-Length = 256 residues, 98.4% aligned
Score = 230 bits (587), Expect = 1e-61

NOV 5:  93 FQILRAIGKGSFGKWCIVQKRDTEKMYAMKYMNKQQCIERDEVRNVFRELEILQEIEHV 152
Sbjct:   1 yELLEVLGKGAFGKVyi,-ARDKKTGmVAIKVIKKEK-I.KKKKRERILREIKILKK^ 58

NOV 5: 153 FLVNIiWYSFQDEEMFMVVDLIiLGGDLRYHIiQQ]WQFSEDTVRLYI^^ 212
Sbjct:  59 NIVKLYDVFEDDDKLyLVMEYCEGGDr.FDLLKKRGRLSEDEARFYARQIXiSALEYLHSQG 118

NOV 5: 213 I IHRDVKPDNILLDERGHAHLTDFNI ATII KDG - ERATALAGTKPYMAPEIFHSFVNGGT 271
Sbjct: 119 IIHM^LKPENILLDSDGHVKIJU^FGIAKQLDSGGTLLTTFVGTPEyMAPEVLL GK 173

NOV 5: 272 GYSFEVDWWSVGV^aAYELLRGWRPYDIHSS-NAVESLVQLFSTVSVQYVPTWSKEMVGLL 330
Sbjct: 174 GYGKAVDIWSLGVXLYKLLTGKPPFPGDDQLLALFKKIGKPPPPFPPPEWKISPEAKDLI 233

NOV 5: 331
Sbjct: 234



Table 5F. Domain Analysis of NOV5 

gnl|Pfam|pfam00069, pkinase, Protein kinase domain.
CD-Length = 256 residues, 97.3% aligned
Score = 200 bits (508), Expect = 2e-52

NOV 5:  93 FQILRAIGKGSFGKVVCIVQiOUDTEKMYAMKYMNKQQCIERDEVRNVFRELEILQEIEHV 152
Sbjct:   1 YELGEKLGSGAFGKVY-KGKHKDTGEIVAIKILKKRSLSE--KKKRFLREIQILRRLSHP 57

NOV 5: 153 FLVNLWYSFQDEEDMFiyrVVDLLLGGDLRYHLQQN-VQFSEDTVRLYICEMALALDYLRGQ 211
Sbjct:  58 NIVRLLGVFEEDDHLYLVMEYMEGGDLFDYLRRNGLLLSEKEAKKIALQILRGLEYLHSR 117

NOV 5: 212 HI IHRDVKPDNI LLDERGHAHLTDFNIATI IK- -DGERATALAGTKPYMAPEIFHSFVNG 269
Sbjct: 118 GIVHRDLKPENXLLDENGTVKIADFGLARKLESSSYEKLTTFVGTPEYMAPEVL E 172

NOV 5: 270 GTGYSFEVDWWSVGVMAYELLRGWRPY-DIHSSNAVESLVQLFSTVSVQYVPTWSKEMVG 328
Sbjct: 173 GRGYSSKVDVWSLGVILYELLTGKLPFPGIDPLEELFRIKE-RPRLRLPLPPNCSEELKD 231

NOV 5: 329 LLRKVLLTVNPEHRLSSLQ 347 (SEQ ID NO:161)
Sbjct: 232 LIKK-CLNKDPEKRPTAKE 249 (SEQ ID NO:162)



Table 5G. Domain Analysis of NOV5 

gnl|Smart|smart00219, TyrKc, Tyrosine kinase, catalytic domain;
Phosphotransferases. Tyrosine-specific kinase subfamily.
CD-Length = 258 residues, 83.7% aligned
Score = 100 bits (250), Expect = 1e-22

NOV 5:  95 ILRAIGKGSFGKW- - C I VQKRDTEKMYAMKYMNKQQCI ERDEVRNVFRELEI LQEI EHV 152
Sbjct:   3 LGKKLGEGAFGEVYKGTLKGKGGVEVEVAVKTLKEDASEQ--QIEEFLREARLMRKI*DHP 60

NOV 5: 153 FLVNLW YSFQDEEDMFM WDLLLGGDLRYHLQQN- - VQFSEDTVRLYI CEMALALDYLRG 210
Sbjct:  61 NIVKLLGVCTEEEPLMIVMEYMEGGDLLDYLRKNRPKELSLSDLLSFALQIARGMEYLES 120

NOV 5: 211 QHI IHRDVKPDNI LLDERGHAHLTDFNI ATI IKDGE - RATALAGTKP - - YMAPEI FHSFV 267
Sbjct: 121 KNFVHRDIiAARNCLVGENKTVKIADFGLARDLYDDDYYRKKKSPRLPIRWMAPESLKDGK 180

NOV 5: 268 NGGTGYSFEVDWWSVGVMAYELL-RGWRPYDIHSSNAVESLVQ 309 (SEQ ID NO:163)
Sbjct: 181 FTSKSDVWSFGVLLWBIFTLGESPYPGMSNEEVLEYLK 218 (SEQ ID NO:164)



Eukaryotic protein kinases are enzymes that belong to a very extensive family of
proteins which share a conserved catalytic core common to both serine/threonine and
tyrosine protein kinases. Protein phosphorylation is a fundamental process for the regulation
of cellular functions. The coordinated action of both protein kinases and phosphatases
controls the levels of phosphorylation and, hence, the activity of specific target proteins. One
of the predominant roles of protein phosphorylation is in signal transduction, where
extracellular signals are amplified and propagated by a cascade of protein phosphorylation
and dephosphorylation events. Two of the best characterized signal transduction pathways
involve the cAMP-dependent protein kinase and protein kinase C (PKC). Each pathway uses
a different second-messenger molecule to activate the protein kinase, which, in turn,
phosphorylates specific target molecules. Extensive comparisons of kinase sequences
defined a common catalytic domain, ranging from 250 to 300 amino acids. This domain
contains key amino acids that are conserved between kinases and are thought to play an
essential role in catalysis. In the N-terminal extremity of the catalytic domain there is a
glycine-rich stretch of residues in the vicinity of a lysine residue, which has been shown to
be involved in ATP binding. In the central part of the catalytic domain there is a conserved
aspartic acid residue which is important for the catalytic activity of the enzyme.
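
The glycine-rich stretch and the conserved aspartate described above can be located in a sequence with simple pattern matching. The sketch below is illustrative only: the two patterns (`G.G..G` for the glycine-rich loop and `HRD` for the catalytic-loop aspartate) are simplified assumptions, not the full PROSITE kinase signatures, and the function name is hypothetical. The test fragments are NOV5 residues as they appear in the domain alignments, with OCR spacing removed.

```python
import re

# Simplified, illustrative patterns (assumptions, not the PROSITE signatures):
GLYCINE_LOOP = re.compile(r"G.G..G")   # glycine-rich stretch, Gly-x-Gly-x-x-Gly
CATALYTIC_ASP = re.compile(r"HRD")     # His-Arg-Asp around the conserved aspartate


def find_kinase_motifs(seq: str) -> dict:
    """Return 1-based start positions of each motif within seq."""
    seq = seq.upper()
    return {
        "glycine_loop": [m.start() + 1 for m in GLYCINE_LOOP.finditer(seq)],
        "catalytic_asp": [m.start() + 1 for m in CATALYTIC_ASP.finditer(seq)],
    }


# Two NOV5 fragments taken from the domain alignments (Table 5F).
fragment_a = "FQILRAIGKGSFGKVV"
fragment_b = "IIHRDVKPDNILLDERGHAHLTDFN"
print(find_kinase_motifs(fragment_a))
print(find_kinase_motifs(fragment_b))
```

Positions are reported relative to the fragment, so they would need to be offset by the fragment's start coordinate to map back onto the full-length protein.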

Protein kinases and phosphatases regulate cell-cycle progression, transcription,
translation, protein sorting and cell adhesion events that are critical to the inflammatory
process. Two of the best-characterized immunosuppressants, cyclosporin and rapamycin, are
also effective anti-inflammatory drugs. They act directly on protein phosphorylation and, as
such, validate the concept that small-molecule modulators of phosphorylation cascades
possess anti-inflammatory properties. Examples of serine/threonine protein
kinases that are important in cell proliferation and disease include AKT, RAF1 and PIM1.
Dudek et al. demonstrated that AKT is important for the survival of cerebellar neurons.
Thus, the 'orphan' kinase moved center stage as a crucial regulator of life and death decisions
emanating from the cell membrane. Holland et al. transferred, in a tissue-specific manner,
genes encoding activated forms of Ras and Akt to astrocytes and neural progenitors in mice.
These authors found that although neither activated Ras nor Akt alone was sufficient to
induce glioblastoma multiforme (GBM) formation, the combination of activated Ras and Akt
induced high-grade gliomas with the histologic features of human GBMs. These tumors
appeared to arise after gene transfer to neural progenitors, but not after transfer to
differentiated astrocytes. Increased activity of Ras is found in many human GBMs and Akt
activity is increased in most of these tumors, implying that combined activation of these 2
pathways accurately models the biology of this disease. Another disease that involves yet
another serine/threonine kinase is Peutz-Jeghers syndrome (PJS), an autosomal dominant
disorder characterized by melanocytic macules of the lips, buccal mucosa, and digits,
multiple gastrointestinal hamartomatous polyps, and an increased risk of various neoplasms.
Jenne et al. identified and characterized the serine/threonine kinase STK11 and identified
mutations in PJS patients. All 5 germline mutations were predicted to disrupt the function of
the kinase domain. They concluded that germline mutations in STK11, probably in
conjunction with acquired genetic defects of the second allele in somatic cells according to
the Knudson model, caused the manifestations of PJS. These authors commented that PJS
was the first cancer susceptibility syndrome identified that is due to inactivating mutations in
a protein kinase and found mutations in the STK11 gene in 11 of 12 unrelated families with
PJS. Ten of the 11 were truncating mutations. All were heterozygous in the germline. Su et
al. found that of 53 PJS patients with cancer reported to that time, 6 (11%) were diagnosed
with pancreatic adenocarcinoma. Su et al. presented evidence that the STK11 gene plays a
role in the development of both sporadic and familial (PJS) pancreatic and biliary cancers.
They found that in sporadic cancers, the STK11 gene was somatically mutated in 5% of
pancreatic cancers and in at least 6% of biliary cancers examined. In the patient with
pancreatic cancer associated with PJS, there was inheritance of a mutated copy of the STK11
gene and somatic loss of the remaining wild type allele. See: Hunter, (1991) Meth. Enzymol.
200: 3-37; Taylor et al., (1991) Science 253: 407-414; Bhagwat et al., (1999) Oct;4(10):472-479;
Dudek et al., (1997) Science 275: 661-663; Holland et al., (2000) Nature Genet. 25: 55-57;
Jenne et al., (1998) Nature Genet. 18: 38-43; and Su et al., (1996) J. Biol. Chem. 271:
14430-14437.

The novel human serine/threonine protein kinase of the invention contains a protein
kinase domain. Therefore, it is anticipated that this novel protein has a role in the regulation
of essentially all cellular functions and could be a potentially important target for drugs.
Such drugs may have important therapeutic applications, such as treating numerous
inflammatory diseases.

The protein similarity information, expression pattern, cellular localization, and map
location for the NOV4 and NOV5 proteins and nucleic acids disclosed herein suggest that
these Ser/Thr Protein Kinase-like proteins may have important structural and/or physiological
functions characteristic of the Protein Kinase family. Therefore, the nucleic acids and
proteins of the invention are useful in potential diagnostic and therapeutic applications and as
a research tool. These include serving as a specific or selective nucleic acid or protein
diagnostic and/or prognostic marker, wherein the presence or amount of the nucleic acid or
the protein is to be assessed. These also include potential therapeutic applications such as
the following: (i) a protein therapeutic, (ii) a small molecule drug target, (iii) an antibody
target (therapeutic, diagnostic, drug targeting/cytotoxic antibody), (iv) a nucleic acid useful in
gene therapy (gene delivery/gene ablation), (v) an agent promoting tissue regeneration in
vitro and in vivo, and (vi) a biological defense weapon.

The nucleic acids and proteins of the invention have applications in the diagnosis
and/or treatment of various diseases and disorders. For example, the compositions of the
present invention will have efficacy for the treatment of patients suffering from: Systemic
lupus erythematosus, Autoimmune disease, Asthma, Emphysema, Scleroderma, Cancer,
Fertility disorders, Reproductive disorders, Tissue/Cell growth regulation disorders,
Developmental disorders as well as other diseases, disorders and conditions.

These materials are further useful in the generation of antibodies that bind
immunospecifically to the novel substances of the invention for use in therapeutic or
diagnostic methods. These antibodies may be generated according to methods known in the
art, using prediction from hydrophobicity charts, as described in the "Anti-NOVX
Antibodies" section below. For example, the disclosed NOV4 and NOV5 proteins have
multiple hydrophilic regions, each of which can be used as an immunogen. In one
embodiment, a contemplated NOV4 epitope is from about amino acids 40 to 52. In another
embodiment, a contemplated NOV4 epitope is from about amino acids 60 to 65. In other
specific embodiments, contemplated NOV4 epitopes are from about amino acids 90 to 110,
120 to 135, 160 to 168, 210 to 212, 260 to 275 and 310 to 315. In one embodiment, a
contemplated NOV5 epitope is from about amino acids 45 to 55. In another embodiment, a
contemplated NOV5 epitope is from about amino acids 120 to 150. In other specific
embodiments, contemplated NOV5 epitopes are from about amino acids 160 to 170, 215 to
225, 280 to 310, 350 to 375, 390 to 420 and 440 to 455.
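
Prediction of hydrophilic regions from a hydrophobicity chart can be sketched as a sliding-window average over a per-residue scale. The example below is a minimal illustration, not the method actually used for the epitope ranges above: the text does not name a scale, so the Kyte-Doolittle scale, the window of 7 residues and the cutoff of -2.0 are all assumptions, as are the function names.

```python
# Kyte-Doolittle hydropathy values (Kyte & Doolittle, 1982); higher = more hydrophobic.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq: str, window: int = 7) -> list:
    """Mean hydropathy of every full window along seq (index 0 = residues 1..window)."""
    values = [KD[aa] for aa in seq.upper()]
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


def hydrophilic_windows(seq: str, window: int = 7, cutoff: float = -2.0) -> list:
    """1-based (start, end) ranges of windows whose mean hydropathy is below cutoff."""
    profile = hydropathy_profile(seq, window)
    return [(i + 1, i + window) for i, score in enumerate(profile) if score < cutoff]


# Toy peptide (hypothetical, for illustration only): a lysine-rich core
# flanked by isoleucine runs.
toy = "IIIIIIIKKKKKKKIIIIIII"
print(hydrophilic_windows(toy))
```

Overlapping windows that pass the cutoff would normally be merged into a single candidate region before choosing an immunogen; that step is omitted here for brevity.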

NOV6 

A disclosed NOV6 nucleic acid (designated as CuraGen Acc. No. CG56684-02), 
encodes a novel Glycodelin-like protein and includes the 581 nucleotide sequence (SEQ ID 
NO: 19) shown in Table 6A. An open reading frame for the mature protein was identified 
beginning with an ATG codon at nucleotides 36-38 and ending with a TAG codon at 
nucleotides 549-551. Putative untranslated regions downstream from the termination codon 
and upstream from the initiation codon are underlined in Table 6A, and the start and stop 
codons are in bold letters. 



Table 6A. NOV6 Nucleotide Sequence (SEQ ID NO:19) 

CACTCCAGAGCTCAGAGCCACCCACAGCCa.CAGCTA TGCAGTGCCTCCTGCTCACCCTGAGCATGGCCCTGGTC 
TGTGCCATCCAGGCCAGGGACATCCCCCAGACCAAGCAGGACGTGGAGCTCCCAAAGTTGGCAGGGACCTGGTA 
CTCCATGGCCATGGTGGCCAGTGACTTCTCCCrCCTGGAGACCGTGGAGGCCCCTCTGAGGGTCAACATCACCT 
CGCTGTGGCCCACCCCCGAGGGCAACCTGGAGATCATTCTGCACAGATGGGAACACCACAGATGCGTTGAGAGG 
ACCGTCCTCGCCCAGAAGACTGAGGACCCGGCTGTGTTCATGGTCGACCGTAGCAGGAGCTACGTGTTCTTCTG 
CATGGGGACCACCACACCCAGTGCTGACCACCACACGATGTGCCAGTACCTGGGGATGACAGCCAGGACCCTAG 
AGGCAGACGACAAGGTCATGGAGGAATTCATCAGCTTTCTCAGGACCCTGCCCGTGCACATGTGGATCTTCCTG 
GACGTTACCCAGGCXSGAACAGTGCCGCGTCTAQATGAGCTCCTOCTCAGTCCTGC 

The nucleic acid sequence of NOV6 maps to chromosome 9 and has 293 of 346 bases
(84%) identical to a gb:GENBANK-ID:HUMENDOA2|acc:M61886.1 mRNA from Homo
sapiens (Human pregnancy-associated endometrial alpha2-globulin mRNA, complete cds) (E 
= L4e^^. 

A disclosed NOV6 polypeptide (SEQ ID NO:20) is 171 amino acid residues in length
and is presented using the one-letter amino acid code in Table 6B. The SignalP, Psort and/or
Hydropathy results predict that NOV6 has a signal peptide and is likely to be localized
outside of the cell with a certainty of 0.5899. In alternative embodiments, a NOV6
polypeptide is located to the microbody (peroxisome) with a certainty of 0.1391, the
endoplasmic reticulum (membrane) with a certainty of 0.1000, or the endoplasmic reticulum
(lumen) with a certainty of 0.1000. The SignalP predicts a likely cleavage site for a NOV6
peptide between amino acid positions 18 and 19, i.e., at the sequence IQA-RD.
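
The predicted cleavage site can be checked by translating the start of the open reading frame and splitting the product after residue 18. The sketch below is illustrative: it uses the standard genetic code, takes the first 60 nucleotides of the NOV6 open reading frame (nucleotides 36-95 of Table 6A), and uses hypothetical function names.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate complete codons of dna, stopping before the first stop codon."""
    protein = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        aa = CODON_TABLE[dna[i:i + 3].upper()]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def split_at_cleavage(protein: str, last_signal_residue: int) -> tuple:
    """Split a precursor into (signal peptide, remainder) after a 1-based position."""
    return protein[:last_signal_residue], protein[last_signal_residue:]


# First 60 nucleotides of the NOV6 open reading frame (nucleotides 36-95 of Table 6A).
orf_start = "ATGCAGTGCCTCCTGCTCACCCTGAGCATGGCCCTGGTCTGTGCCATCCAGGCCAGGGAC"
signal, mature = split_at_cleavage(translate(orf_start), 18)
print(signal, mature)
```

The signal peptide obtained this way ends in IQA and the remainder begins with RD, in agreement with the IQA-RD site given above.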

Table 6B. Encoded NOV6 Protein Sequence (SEQ ID NO:20)

hrwehhrcvertviaqktedpavemvdrsrsyvffcmgtttpsadhh™ 

lpvhmwi fldvtqaeqcrv 

The NOV6 amino acid sequence was found to have 110 of 186 amino acid residues
(59%) identical to, and 132 of 186 amino acid residues (70%) similar to, the 186 amino acid
residue ptnr:SPTREMBL-ACC:O77511 protein from Papio cynocephalus (Yellow baboon)
(BETA-LACTOGLOBULIN I) (E = 3.2e"*^). 

NOV6 is expressed in at least the following tissues because of the expression pattern
of (GENBANK-ID: gb:GENBANK-ID:HUMENDOA2|acc:M61886.1) a closely related
Human pregnancy-associated endometrial alpha2-globulin mRNA, complete cds homolog in
species Homo sapiens: endometrium, amnion, and in semen.

NOV6 has homology to the amino acid sequences shown in the BLASTP data listed 
in Table 6C. 



Table 6C. BLAST results for NOV6

Gene Index/Identifier; Protein/Organism; Length (aa); Identity (%); Positives (%); Expect

gi|17468008|ref|XP_070794.1| (XM_070794); similar to hypothetical protein (H. sapiens) [Homo sapiens]; 187; 131/180 (72%); 131/180 (72%); 2e-63

gi|3483096|gb|AAC33251.1| (AF021261); beta-lactoglobulin X [Papio cynocephalus]; 186; 112/192 (58%); 136/192 (70%); 2e-49

gi|130701|sp|P09466|PAEP_HUMAN; Glycodelin precursor (GD) (Pregnancy-associated endometrial alpha-2 globulin) (PEG) (PAEG) (Placental protein 14) (Progesterone-associated endometrial protein) (Progestagen-associated endometrial protein); 180; 98/184 (53%); 127/184 (68%); 7e-44

gi|4884164|emb|CAB43305.1| (AL050169); hypothetical protein [Homo sapiens]; 188; 98/184 (53%); 127/184 (68%); 1e-43

gi|125905|sp|P21664|LACA_FELCA; BETA-LACTOGLOBULIN II; 163; 85/164 (51%); 112/164 (67%); 2e-37



The homology of these sequences is shown graphically in the ClustalW analysis 
shown in Table 6D. 

Table 6D. ClustalW Analysis of NOV6

1) NOV6 (SEQ ID NO:20)
2) gi|17468008| (SEQ ID NO:165)
3) gi|3483096| (SEQ ID NO:166)
4) gi|130701| (SEQ ID NO:167)
5) gi|4884164| (SEQ ID NO:168)
6) gi|125905| (SEQ ID NO:169)

[ClustalW multiple sequence alignment of NOV6 with the five sequences listed above; the aligned rows are not recoverable from the source text.]



Table 6E lists the domain description from DOMAIN analysis results against NOV6.
This indicates that the NOV6 sequence has properties similar to those of other proteins
known to contain these domains.
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Table 6E. Domain Analysis of NOV6 

gnl [Pf am|pfam00061, lipocalin, Lipocalin / cytosolic fatty- acid binding 
protein family. Lipocalins are transporters for small hydrophobic molecules, 
such as lipids, steroid hormones, bilins, and retinoids. Alignment subsumes 
both the lipocalin and fatty acid binding protein signatures from PROSITE, 
This is supported on structural and functional grounds. Structure is an eight- 
stranded beta barrel. 

CD-Length = 145 residues, 100.0% aligned 
Score = 87.8 bits (216), Expect = 5e-19 

NOV 6: 32 KLAGTWYSMAIWASDFSLLETVEAPLRWITSLWPTPEGNLEIILHRWEHm 91 

I II II H I II + 1 ^ \ lllllh +^ 1 1 

Sbjct: 1 KFAGKWYLVASANFDPELKEEL-GVLEATIOCEITPIiKEGNLEIVFDGDKNGICEETFGKI. 59 

NOV 6: 92 QKTEDPAVFMVDRSR SYVFFCMGTTTPSADHHTMCQYLGMTARTLEAD 139 

+ |k I + +1+ k + + II I 

Sbjct: 60 EKTKKLGVEFDYYTGDNRFWLDTDYDNyiiLVCVQ-KGDCaTETSRTAELY GRTPELS 115 

NOV 6: 140 DKVMEEFISFLRTLPVHMWIFLDVTQAEQC 169 (SEQ ID NO: 170) 

Sbjct: 116 PEALELFETATKELGIPEDNWCTRQTERC 145 (SEQ ID NO: 171) 



The protein of the invention exhibits sequence similarity to glycodelin and members
of the lipocalin family, whose properties are described below. Based on the similarity to
these proteins, the protein of the invention is likely to possess a similar expression
pattern, properties, or physiological function or role in disease. Placental protein-14
(PP14) is synthesized by the human secretory endometrium and decidua. It is abundantly
secreted by the human endometrium under the influence of progesterone. Julkunen et al.
(1988) isolated cDNA clones corresponding to PP14. PP14 is encoded by a 1-kilobase mRNA
that is expressed in secretory endometrium and decidua but not in postmenopausal
endometrium, placenta, liver, kidney, and adrenals. The 162-residue-long sequence of PP14
is highly homologous to beta-lactoglobulin, the main component of equine, bovine, and
ovine milk whey. Morris et al. (1996) reported that PP14, which they called glycodelin
(Gd), exists as 2 gender-specific forms that differ in their glycosylation patterns. GdA,
found in amniotic fluid, inhibits sperm-zona pellucida binding in an established sperm-egg
binding system; GdS, found in seminal plasma, does not. Both forms suppress responses by a
variety of immune effector cell types.

Lipocalins are a group of extracellular proteins, first described by Pervaiz and Brew 
(1987), that are able to bind lipophiles by enclosure within their structures, minimizing 
solvent contact. Based on the known 3-dimensional structure of 5 members of the lipocalin 
family, i.e., retinol binding protein, beta-lactoglobulin, bilin binding protein, mouse major 
urinary protein, and rat urinary alpha-2-globulin, the general architecture appears to be highly 
appropriate for binding a variety of hydrophobic ligands. On the basis of highly conserved
amino acid sequences and of a size around 18 to 20 kD, about 20 proteins have been
designated as lipocalins. In tear fluid, a group of 6 proteins with molecular weights ranging
from 15 to 20 kD and various isoelectric points are abundant. The N-terminal sequences of 
these proteins led Lassagne and Gachon (1993) to hypothesize that they are isoforms and 
belong to the lipocalin family. Tear prealbumin cDNA (Redl et al. (1992)) from lacrimal 
gland encodes a 176-amino acid protein that shares 58% identity to the von Ebner gland 
protein of the rat and significant homology with other lipocalins including beta lactoglobulin. 
From genetic and biochemical data, tear prealbumin is considered a member of the lipophilic- 
ligand carrier protein superfamily. Though tear prealbumin was originally described as a 
tear-specific protein, Redl et al (1992) showed that tear prealbumin-specific antiserum 
reacted with human saliva, sweat, and nasal mucus proteins. 

Von Ebner glands (VEG) are small lingual salivary glands. Their ducts open into
trenches of circumvallate and foliate papillae, and their secretions influence the milieu where
the interaction between taste receptor cells and sapid molecules ('sapid' means 'possessing
taste') takes place. The major secretion of human VEG is a protein with a molecular mass of
18 kD. This VEG protein is identical to lipocalin-1. Blaker et al. (1993) isolated a cDNA
clone from a human VEG library and showed that it contained an insert of 735 bp, including
an open reading frame that encodes the human VEG protein of 176 amino acids. The VEG
proteins are members of the lipocalin protein superfamily; together with odorant-binding
protein, they constitute a new subfamily. Sequence similarity to proteins such as retinol
binding protein and odorant binding protein suggests a possible function for the human VEG
protein in taste perception.

Other members of the lipocalin family include: orosomucoid, alpha-1-microglobulin,
progestagen-associated endometrial protein, the gamma chain of C8, and prostaglandin D2 
synthase. 

Using Northern blotting and immunohistology, Holzfeind et al. (1996) found that
LCN1 is expressed in the human prostate. Cloning and sequencing showed that the transcript
is identical to that found in tears. This finding suggested to Holzfeind et al. (1996) that the
lipocalin-1 protein is not specific to tears and saliva, as was previously believed, but is
multifunctional.

Van't Hof et al. (1997) showed that LCN1 inhibits the cysteine-protease papain in
vitro, similar to cystatins (see 123857). They suggested that LCN1 plays a role in the
nonimmunologic defense and in the control of inflammatory processes in oral and ocular
tissues.

Redl et al. (1998) found enhanced LCN1 secretion in the airways of patients with
cystic fibrosis (CF; 219700). Northern blot analysis of RNA from normal trachea and RNA
isolated from tracheal biopsies of patients with CF indicated that the enhanced secretion was
due to an upregulated expression of the LCN1 gene. Thus, the investigations presented the
first clear evidence that LCN1 is induced in infection or inflammation and supported the idea
that this lipocalin functions as a physiologic protection factor of epithelia in vivo.

The protein similarity information, expression pattern, and map location for the
Glycodelin-like protein and nucleic acid disclosed herein suggest that this Glycodelin-like
protein may have important structural and/or physiological functions characteristic of the
Lipocalin family. Therefore, the nucleic acids and proteins of the invention are useful in
potential diagnostic and therapeutic applications and as a research tool. These include
serving as a specific or selective nucleic acid or protein diagnostic and/or prognostic
marker, wherein the presence or amount of the nucleic acid or the protein is to be assessed,
as well as potential therapeutic applications such as the following: (i) a protein
therapeutic, (ii) a small molecule drug target, (iii) an antibody target (therapeutic,
diagnostic, drug targeting/cytotoxic antibody), (iv) a nucleic acid useful in gene therapy
(gene delivery/gene ablation), (v) a composition promoting tissue regeneration in vitro and
in vivo, and (vi) a biological defense weapon.

The NOV6 nucleic acids and proteins of the invention are useful in potential 
diagnostic and therapeutic applications implicated in various diseases and disorders described 
below and/or other pathologies. For example, the compositions of the present invention will 
have efficacy for treatment of patients suffering from: infertility, endometriosis, other 
reproductive health disorders, lachrymal disorders, cancer, inflammation, autoimmune 
diseases and other diseases, disorders and conditions of the like. 

The novel NOV6 nucleic acid encoding the Glycodelin-like protein of the invention,
or fragments thereof, is useful in diagnostic applications, wherein the presence or amount of
the nucleic acid or the protein is to be assessed. These materials are further useful in the
generation of antibodies that bind immunospecifically to the novel substances of the
invention for use in therapeutic or diagnostic methods. These antibodies may be generated
according to methods known in the art, using prediction from hydrophobicity charts, as
described in the "Anti-NOVX Antibodies" section below. The disclosed NOV6 protein has
multiple hydrophilic regions, each of which can be used as an immunogen. In one
embodiment, a contemplated NOV6 epitope is from about amino acids 25 to 35. In another
embodiment, a contemplated NOV6 epitope is from about amino acids 70 to 75. In other
specific embodiments, contemplated NOV6 epitopes are from about amino acids 85 to 90, 92
to 98, 110 to 115, 130 to 139 and 148 to 150.
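
Hydrophilic stretches of the kind listed above are commonly read off a hydropathy profile. The following is a minimal sketch, assuming the Kyte-Doolittle scale, a 7-residue sliding window and a cutoff of -1.0. The disclosure does not state which scale or parameters were used, so the window, the cutoff, the function names and the toy peptide are illustrative assumptions only.

```python
# Kyte-Doolittle hydropathy values (more negative = more hydrophilic).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=7):
    """Return (center_position, mean_hydropathy) pairs; positions are 1-based."""
    seq = seq.upper()
    half = window // 2
    profile = []
    for start in range(len(seq) - window + 1):
        segment = seq[start:start + window]
        mean = sum(KD[aa] for aa in segment) / window
        profile.append((start + half + 1, mean))
    return profile


def hydrophilic_regions(seq, window=7, threshold=-1.0):
    """Merge consecutive window centers whose mean hydropathy is below threshold."""
    regions = []
    for pos, score in hydropathy_profile(seq, window):
        if score < threshold:
            if regions and regions[-1][1] == pos - 1:
                regions[-1] = (regions[-1][0], pos)  # extend the current region
            else:
                regions.append((pos, pos))           # start a new region
    return regions


# Hypothetical peptide (not the NOV6 sequence): a charged core flanked by leucines.
toy_peptide = "LLLLLLLLLL" + "KDEKRDEKDE" + "LLLLLLLLLL"
candidate_regions = hydrophilic_regions(toy_peptide)
```

Each returned pair is a 1-based range of window centers; in practice such ranges would only be a starting point for choosing an immunogen.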



NOV7 

A disclosed NOV7 nucleic acid (alternatively referred to herein as CG56977-01)
encodes a novel Neuropathy Target Esterase/Swiss Cheese Protein-like protein and includes
the 4718 nucleotide sequence (SEQ ID NO:21) shown in Table 7A. An open reading frame
for the mature protein was identified beginning with an ATG codon at nucleotides 1-3 and
ending with a TAG codon at nucleotides 4258-4260. Putative untranslated regions are
underlined in Table 7A, and the start and stop codons are in bold letters.
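
The coordinates given above can be cross-checked with simple arithmetic. The sketch below is illustrative only (the function name and dictionary keys are assumptions): it takes the 1-based position of the first nucleotide of the start codon, the last nucleotide of the stop codon, and the total sequence length, and derives the number of encoded residues and the length of the 3' untranslated region.

```python
def orf_summary(start_nt, stop_last_nt, total_length):
    """Summarize an ORF given 1-based, inclusive nucleotide coordinates."""
    span = stop_last_nt - start_nt + 1
    if span % 3 != 0:
        raise ValueError("ORF span is not a whole number of codons")
    codons = span // 3
    return {
        "codons_including_stop": codons,
        "encoded_residues": codons - 1,          # the stop codon encodes no residue
        "stop_codon_start": stop_last_nt - 2,
        "five_prime_utr_nt": start_nt - 1,
        "three_prime_utr_nt": total_length - stop_last_nt,
    }


# Values quoted in the text: start codon at 1-3, stop codon ending at 4260, 4718 nt total.
nov7_orf = orf_summary(1, 4260, 4718)
```

With these inputs the ORF spans 1420 codons including the stop, which agrees with the 1419-residue polypeptide length stated for NOV7, and leaves 458 nucleotides downstream of the stop codon.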



Table 7A. NOV7 Nucleotide Sequence (SEQ ID NO:21) 

ATGGAGGAAGAGAAAGATGACAGCCCACAGCTGACGGGGATTGCAGTTGGAGCCCTCCTGGCC 
TGTCCTCATCCTTTTCATGTTCAGAAGGCTTAGACAATTTCGACAAGCACAGCCCACTCC^ 

AGAGAGACAAAGTGATGTTTTACGGCCGGAAGATCATGAGGAAGGTGACCACACTCCCCAACACCCTTGTGGAGAAC 

ACTGCCCrGCCCCGGCAGCGGGCCAGGAAGAGGACCaAGGTGCTGTCTTTGGCC?UlGAGGATTCTG 

GGAATACCCGGCCCTGCAGCCCAAGGAGCCCCCGCCCTCCCTGCTGGAGGCCGACCTCACGGAGTTTGACGTGAAGA 

ATTCTCT^CCTGCO^TCGGAAGTTCTGTACATGCTGAAAAACGTTCGGGTCCTGGGCCACTTTGAGAAGCCGCTGTTC 

CTGGAGCTTTGCAAACACATCGTCTTTGTGCAGCTGCAGGAAGGGGAGCACGTCTTCCAGCCCAGGGAG^ 

C^CaTCTGTGTGGTGCAGGACGGGOSGCTGGAGGTCTGCATCCAGGACAC 

AGGTTCTGGCGGGAGACAGCGTCCACAGCCTGCTCAGCATCCTGGACATCATCACCGGCCATGCTGCACCTTACAAA 
ACGGTCTCCGTCCGCGCGGCCATCCCGTCCACCATCCTCCGGCTTCCAGCTGCGGCTTTTCATGGAGTTTTTGAGAA 
ATATCCGGAAACTCTGGTGAGGGTGGTGCAGTTGCAGATCATCATGGTGCGGCTGCAGAGGGTGACCTTTCTGGCTC 
TGCACAACTACCTCGGCCTGACCACAGAGCTCTTCAACGCTGAGAGCCAGGCCATCCCTCTCGTGTCTGTAGCCAGT 
GTGGCTGCCGGGAAGGCCAAGAAGCAGGTGTTCTATGGCGAAGAAGAGCGGCTTAAAAAGCCACCGCGGCTCCATGA 
GTCCTGTGACTCAGCAGATCACGGGGGCGGCCGCCCGGCAGCTGCTGGGCCCCTGCTGAAGAGGAGCCACTCCGTCC 
CCGCGCCTTCCATTCGGAAACAGATCTTGGAGGAGCTGGAGAAGCCCGGGGCAGGTGACCCTGACCCTTCGGCCCCA 
CAAGGGGGCCCAGGCAGTGCCACTTCTGATCTGGGGATGGCATGTGACCGTGCCAGGGTCTTCCTGCACTCGGACGA 
GCACCCCGGGAGCTCCGTGGCCAGCAAGTCCAGGAAAAGCGTGATGGTTGCAGAGATACCCTCCACGGTCTCCCAGC 
ACTCAGAGAGTCACACGGATGAGACCCTGGCCAGCAGGAAGTCGGATGCCATCTTCAGAGCTGCCAAGAAGGACCTG 
CTCACCCTGATGAAGCTGGAAGACTCATCTCTGTTGGATGGCCGGGTGGCGCTTCTGCACGTTCCTGCATGCACGGT 
GGTGTCAATGCAGGGAGACCAAGACGCCAGCATCCTGTTCGTTGTCTTGGGGCTGCTGCACGTGTACCAGCGGAAGA 
TCTGCAGCCAGGAGGACACCTGCTTGTTCTCACGCGCACCCGGGGACTCATCTCTGTTGGATGGCCGGGTGGCGCTT 
CTGCACGTTCCTGCAGGCACGGTGGTGTCAAGGCAGGGAGACCAGGACGCCAGCATCCTGTTCGTGGTCTCGGGGCT 
GCTGCACGTGTACCAGCGGAAGATCGGCAGCCAGGAGGACACCTGCTTGTTCCTCACGCGCCCCGGGGAGATGGTGG 
GCCAGCTGGCCGTGCTCACCGGGGAGCCTCTCATCTTCACCGTCAAGGCCAACAGGGACTGCAGCTTCCTGTCCATC 
TCCAAGGCCCACTTCTATGAAATCATGCGGAAGCAGCCGACCGTCGTCCTGGGTGTGGCGCACACTGTGGTGAAGAG 
GATGTCGTCCTTCGTGCGGCAAATCGACTTTGCCCTGGACTGGGTGGAGGTGGAGGCCGGGCGAGCAATATACAGGC 
AGGGGGACAAGTCCGACTGCACGTACATCATGCTCAGCGGCCGGCTGCGCTCTGTGATCCGGAAGGATGATGGGAAG 
AAGCGCCTGGCCGGGGAGTACGGCCGAGGAGACCTCGTCGGCGTGGTGGAGACACTGACCCACCAGGCCCGGGCGAC 
CACGGTGCATGCCGTTCGGGACTCAGAATTGGCCAAGCTGCCGGCAGGAGCCCTCACGTGCATCAA.GCGCAGGTACC 
CACAGGTGGTGACTCGGCTGATTCAT CTCTTGGGTGAGAAGATCCTGGGCAGCCTCCAGCAGGGACCTGTGACAGGC 
CACCAGCTTGGGCTCCCCACGGAGGGCAGCAAGTGGGACTTGGGGAACCCGGCTGTCAACCTGTCCACGGTGGCAGT 
GATGCCCGTGTCAGAGGAAGTGCCCCTCACCGCCTTCGCCCTGGAGCTGGAGCATGCCCTCAGCGCCATCGGCCCGC 
CCCTGCTGCTGACTAGTGACAACATAAAACGGCGCCTTGGCTCCGCTGCCCTGGACAGTGTTCACGAGTACCGGCTG 
TCCAGCTGGCTGGGGCAGCAGGAGGACACCCACAGGATCGTGCTCTACCAGGTAGATGGCACGCTCACACCCTGGAC 
CCAGCGCTGCGTGCGCCAGGCCGACTGCATCCTCATCGTGGGCCTGGGTGACCAGGAGCCCACAGTGGGCGAGCTGG 
AGCGGATGCTGGAGAGCACAGCTGTGCGTGCCCAGAAGCAGCTGATCCTGCTGCACAGGGAGGAGGGCCCGGCGCCA 
GCGCGCACCGTGGAGTGGCTCAACATGCGGAGCTGGTGCTCCGGCCACCTGCACCTCTGCTGCCCGCGCCGCGTCTT 
CTCCAGGAGGAGCCTGCCCAAGCTGGTGGAGATGTACAAGCATGTCTTCCAGCGGCCCCCGGACCGACACTCAGACT 
TCTCCCGCCTGGCGAGGGTGCTGACGGGCAACGCCATTGCCCTGGTGCTTGGGGGAGGGGGAGCAAGCATGACGTCC 
TTGATGAAGGCCGCGCTGGACCTCACCTACCCCATCACGTCCATGTTCTCCGGAGCCGGCTTCAACAGCAGCATCTT 
CAGCGTCTTCAAGGACCAGCAGATCGAGGACCTGTGGATTCCTTATTTCGCCATCACCACCGACATCACAGCCTCGG 
CCATGCGGGTCCACACCGACGGCTCCCTGTGGTGGTACGTGCGTGCCAGCATGTCCCTGTCCGGTTACATGCCCCCT 
CTCTGTGACCCGAAGGACGGACACCTGCTGATGGACGGGGGCTACATCAACAACCTCCCAGCTGCCTCCGCTCCAAG 
AAGCCTGGGCTGGAACACGTTTTCCTTAGAGTATGCCAAGGGAAAATGTCAGGCTGGCATCAGAGCTCCGAGAAC^
GCACACGCGTGTACATGCACACGCAGGCACCGGCAGCATGTGCTCCAGCATATGGCCCTGTTTC^

ATGCAGAACAAAGGCCAAGTCGAGGAACTGGGAGCAATTAAGCCCCATCTGTGCCCACAGTCAGAAACTAACAGCCT 

GCAGGGGGTAACCAGGGCTGGCTTCTCCCTAGCGGATGTGGCCCGGTCCATGGGGGCAAAAGTGGTGATCGCCATTG 

AO^TGGGCAGCCGAGATGAGACGGACCTCACCAACTATGGGGATGCGCTGTCTGGGTGGTGGCTGCTGTGGAAACGC 

TGGAACCCCTTGGCCACGAAAGTCAAGGTGTTGAACATGGCAGAGATTCAGACGCGCCTGGCCTACGTGTGTTGCGT 

GCGGCAGCTGGAGGTGGTGAAGAGCAGTGACTACTGCGAGTACCTGCGCCCCCCCATCGACAGCTACAGCACCCTGG 

ACTTCGGCAAGTTCAACGAGATCTGCGAAGTGGGCTACCAGCACGGGCGCACGGTGTTTGACATCTGGGGCCGCAGC 

GGCGTGCTGGAGAAGATGCTCCGCGACCAGCAGGGGCCGAGCAAGAAGCCCGCGAGTGCGGTCCTCACCTGTCCCAA 

CGCCTCCTTCACGGACCTTGCCGAAATTGTGTCTCGCATTGAGCCCGCCAAGCCCGCCATGGTGGATGACGAATCTG 

ACTACCAGACGGAGTACGAGGAGGAGCTGCTGGACGTCCCCAGGGATGCATACGCAGACTTCCAGAGCACCTCAGCC 

CAGCAGGGCTCAGACTTGGAGGACGAGTCCTCACTGCGGCATCGACACCCCAGTCTGGCTTTCCCAAAACTGTCTGA 

GGGCTCCTCTGACCAGGACGGGTAGAGGCCTCTGCTAAAGAGCCCGGATGCAGCGTCTTCCGTGGGACTGTCCCCAA 

GGCTGAGGCTCCTGCCAAGTCCTAGGGGCCTCTGTACCTGCCCTGCTGGAAGCCCTGACTTCCCCGGGGCCCCAGGC 

TGTGTTAGGGTTCTCTGGGCCTCTTCTTTGTACCAGCAGCCCTGCATACAGGGCCCTGTGAGCCCCCCTGCAGTCCT 

GTGAGGCCCCTOAAGCTCTGTGAGGCCCCTGAAGCTCTGTGAACCCCCTGCAGCCCTGTGAGGCCCCCCG^^ 

GTGAGGCCCCCCGAAGCCCTGTGAACCACCTGCTGCCCTGTGAGGCCCCCAAAGCCCTGTGAACTGCCTGCTGTC 

GTGAACTGCCTGCTGCCCTGTGAGGTGTGGGAGCCCTGATGCTGCCGTGTG^ 

GTTGAAAAAAAAAAAAAAAAA — 



The nucleic acid sequence of NOV7 maps to chromosome 9 and has 1104
of 1504 bases (73%) identical to a gb:GENBANK-ID:HSAJ4832|acc:AJ004832.1 mRNA
from Homo sapiens (Homo sapiens mRNA for neuropathy target esterase) (E = 0.0).

A disclosed NOV7 polypeptide (SEQ ID NO:22) is 1419 amino acid residues in
length and is presented using the one-letter amino acid code in Table 7B. The SignalP, Psort
and/or Hydropathy results predict that NOV7 has a signal peptide and is likely to be localized
to the endoplasmic reticulum (membrane) with a certainty of 0.8200. In alternative
embodiments, a NOV7 polypeptide is located to the nucleus with a certainty of 0.2400, the
plasma membrane with a certainty of 0.1900, or the endoplasmic reticulum (lumen) with a
certainty of 0.1000. The SignalP predicts a likely cleavage site for a NOV7 peptide between
amino acid positions 38 and 39, i.e., at the sequence LRQ-FR.



Table 7B. Encoded NOV7 Protein Sequence (SEQ ID NO:22) 



MEEEKDDSPQLTGIAVGALIiAIJ^VGVLILFMFRRLRQFRQAQPTPQYRFRKRDKV^ 

entai^rqraiocrtkvlsijucrilrfkkeypalqpkepppslleadltefdvknshlpsevl™ 
kplfiiblckhivfvqiiqegbhvfqprepdpsicwqixsrlevciqdtdgtevvvkevlagdsvhs 

HAAPYKTVSVRAAIPSTILRLPAAAFHGVFEKYPETLVRWQLQIIiyrTOLQRVTFI^ 

IPLVSVASVAAGKAKKQVFYGEEERLKKPPRLHESCDSADHGGGRPAAAGPLLKRSHSVPAPSIRKQILEELEKP 

GAGDPDPSAPQGGPGSATSDIlG^mCDRARVFIJ^SDEHPGSSVASKSRKSVIyrVAEIPSTVSQHSESHTDETI^ 

SDAIFRAAKKDLLTIJflKIiEDSSIilXSRVALLHVPACTWSMQGDQDM 

APGDSSLIJ)GRVALIiHVPAGTWSRQCa)QDASILFWSGLLIIVYQRKIGSQEDTCLFLTRPGEMVGQ 

LIFTVKANRDCSFLSISKAHFYEIMRKQPTVVLGVAHTWKRMSSFWQIDFAIiDWVE^ 

YIMLSGRLRSVIRKDDGKKRIAGEYGRGDLVGVVETLTHQARATTVHAVRDSEIJ^PAGALTCIia^ 

IiIHLLGEKII^SLQQGPVTGHQIiCTiPTEGSKWDIXSNPAVNLSTVAVMPVSEEV^ 

TSDNIKRRLGSAAIJ^SVHEYRLSSWLGQQEDTHRIVLYQVDGTnTPWTQRCVRQADCILI^^ 

MI^STAVRAQKQLIIJ^REEGPAPARTVEWIJ)JMRSWCSGHLHL 

FSRLARVI.TGMAIALVIiGGGGASMTSIjMKAAI*DLTyPITSMFSGAGFNSSIFSVFK^ 
ASAMRVHTDGSLWWYVRASMSLSGYMPPIiCDPKIX3HLL^C^ 

APRTCTRWMHTQAPAACAPAYGPVCQLSSMQNKGQVEELGAIKPHLCPQSETNSLQGVTRAGFSI^VARSM 
KWIAIDVGSRDETDLTNYGDAI,SGWmiWKRWNPlATKVKVI.^n^lAE 

PIDSYSTIJ>FGKFNEICEVGyQHGRTVFDIWGRSGVLEKMLRI>QQGPSKKPASAVI.TCPNASFTO 
AKPAMVDDESDYQTEYEEEIJiDVPRDAYADFQSTSAQQGSDIJEX>ESSIiRHRHPSIJ^PKLSEG

The NOV7 amino acid sequence was found to have 349 of 507 amino acid residues
(68%) identical to, and 407 of 507 amino acid residues (80%) similar to, the 1327 amino acid
residue ptnr:SPTREMBL-ACC:Q9R114 protein from Mus musculus (Mouse)
(NEUROPATHY TARGET ESTERASE HOMOLOG) (E = 0.0).
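
The identity and similarity percentages quoted for NOV7 are ratios of matching positions to aligned positions. A minimal sketch follows; the function names are illustrative, and the use of truncation rather than rounding for the whole-number figure is an assumption, chosen because 349/507 is about 68.8% yet is reported as 68%.

```python
def percent(matches, aligned_length):
    """Exact percentage of matching positions over the aligned length."""
    return 100.0 * matches / aligned_length


def whole_percent(matches, aligned_length):
    """Whole-number percentage obtained by truncation (assumed reporting style)."""
    return (100 * matches) // aligned_length


# Figures quoted in the text for NOV7.
protein_identity = whole_percent(349, 507)       # identical residues
protein_similarity = whole_percent(407, 507)     # similar residues
nucleotide_identity = whole_percent(1104, 1504)  # identical bases
```

Under that assumption the three values come out as 68, 80 and 73, matching the percentages given in the text.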

NOV7 is expressed in at least the following tissues: blood, tonsil, lung tumor, and
prostate (normal). Expression information was derived from the tissue sources of the
sequences that were included in the derivation of the sequence of NOV7. The sequence is
also predicted to be expressed in the following tissues because of the expression pattern of
a closely related Homo sapiens mRNA for neuropathy target esterase
(gb:GENBANK-ID:HSAJ4832|acc:AJ004832.1): bone, brain,
breast, germ cell, heart, kidney, lung, pancreas, pooled, prostate, testis, tonsil, uterus, whole
embryo, amnion (normal), brain, breast, colon, head, neck, kidney, lung, placenta, prostate
(normal), skin, and uterus.

Possible single nucleotide polymorphisms (SNPs) found for NOV7 are listed in Table 7C.



Table 7C: SNPs

Variant     Nucleotide Position   Base Change   Amino Acid Position   Amino Acid Change
13375546    707                   G>A           236                   Arg>His
13376992    3984                  C>G           NA                    NA
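
The relationship between a nucleotide position and the affected amino acid position in Table 7C can be sketched as follows. This is an illustrative helper (the name `codon_of` is an assumption) that presumes positions are 1-based and that the coding sequence starts at nucleotide 1, as stated for the NOV7 open reading frame.

```python
def codon_of(nt_pos, cds_start=1):
    """Map a 1-based nucleotide position to (codon number, position within codon)."""
    offset = nt_pos - cds_start
    if offset < 0:
        raise ValueError("position lies upstream of the coding sequence")
    return offset // 3 + 1, offset % 3 + 1


variant_707 = codon_of(707)    # (236, 2): second base of codon 236
variant_3984 = codon_of(3984)  # (1328, 3): third base of codon 1328
```

Nucleotide 707 falls in codon 236, in agreement with the amino acid position listed for the Arg>His change. Nucleotide 3984 falls on the third base of a codon, which would be consistent with a change that does not alter the encoded residue.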



NOV7 also has homology to the amino acid sequences shown in the BLASTP data
listed in Table 7D.



Table 7D. BLAST results for NOV7

Gene Index/Identifier: gi|7657401|ref|NP_056616.1| (NM_015801)
Protein/Organism: neuropathy target esterase; Swiss cheese [Mus musculus]
Length (aa): 1327
Identity (%): 650/1174 (55%)
Positives (%): 779/1174 (65%)
Expect: 0.0

Gene Index/Identifier: gi|16550716|dbj|BAB71033.1| (AK055880)
Protein/Organism: unnamed protein product [Homo sapiens]
Length (aa): 702
Identity (%): 420/483 (86%)
Positives (%): 421/483 (86%)
Expect: 0.0

Gene Index/Identifier: gi|17530839|ref|NP_511075.1| (NM_078520)
Protein/Organism: Swiss cheese; olfactory E [Drosophila melanogaster]
Length (aa): 1425
Identity (%): 447/1112 (40%)
Positives (%): 624/1112 (55%)
Expect: 0.0

Gene Index/Identifier: gi|7290863|gb|AAF46305.1| (AE003442)
Protein/Organism: sws gene product [Drosophila melanogaster]
Length (aa): 1389
Identity (%): 446/1111 (40%)
Positives (%): 623/1111 (55%)
Expect: 0.0

Gene Index/Identifier: gi|5729951|ref|NP_006693.1| (NM_006702)
Protein/Organism: neuropathy target esterase [Homo sapiens]
Length (aa): 1327
Identity (%): 272/548 (49%)
Positives (%): 351/548 (63%)
Expect: e-122
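
Expect values in BLAST reports such as the one above may be printed without a leading mantissa (for example "e-122"). When such values are compared or sorted programmatically they need to be normalized first. The following is a minimal sketch; the function name and the sample list are illustrative assumptions, not part of any BLAST tool.

```python
def parse_expect(text):
    """Convert a BLAST-style Expect string to a float, e.g. 'e-122' -> 1e-122."""
    value = text.strip().lower()
    if value.startswith("e"):
        value = "1" + value  # restore the implied mantissa of 1
    return float(value)


# Expect strings as they appear in the BLAST tables of this section.
reported = ["e-122", "0.0", "2e-63", "7e-44"]
ranked = sorted(reported, key=parse_expect)  # most significant hit first
```

Sorting by the parsed value places "0.0" first and "7e-44" last, since a smaller Expect value indicates a more significant match.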



The homology of these sequences is shown graphically in the ClustalW analysis 
shown in Table 7E. 



Table 7E. ClustalW Analysis of NOV7

1) NOV7 (SEQ ID NO:22)
2) gi|7657401| (SEQ ID NO:172)
3) gi|16550716| (SEQ ID NO:173)
4) gi|17530839| (SEQ ID NO:174)
5) gi|7290863| (SEQ ID NO:175)
6) gi|5729951| (SEQ ID NO:176)

[ClustalW multiple alignment not reproduced: the alignment blocks are not legible in this copy.]



Tables 7F and 7G list the domain description from DOMAIN analysis results against 
NOV7. NOV7 shows similarity to an uncharacterized protein family and, at several 
positions, to a cyclic nucleotide binding domain/cyclic nucleotide monophosphate binding 
domain. This indicates that the NOV7 sequence has properties similar to those of other
proteins known to contain these domains.

Table 7F. Domain Analysis of NOV7

gnl|Pfam|pfam01173, UPF0028, Uncharacterized protein family UPF0028.
CD-Length = 317 residues, 91.2% aligned
Score = 164 bits (416), Expect = 2e-41

NOV 7: 970 PDRHSDFSRLARVLTGNAIALVLGGGGA SMTSLMKAALDLTYPITSMFSGAGFNSSI 1026
Sbjct: 4 lAFQSDFSRIARILTGNAIGLVLGGGGARGAAHIGVIQALKEVGIPI -DIVGGTSIGSLV 62

NOV 7: 1027 FSVFKDQQIEDLWIPYFAITTDITASAMRVHTDGSLWWYVRASMSLSGYMPPLCDPKDGH 1086
Sbjct: 63 GftLY ACDPDSVLV DARAKWFFSGSSSIWDRLMDLTWPRSG- 102

NOV 7: 1087 LLMDGGYINNLPAASAPRSLGWNTFSLEYAKGKCQAGIRAPRTCTRVYMHTQAPAACA-P 1145
Sbjct: 103 -LLTGHRFNRQVQEIFGETLIED-CWRSFFCVSTDLSTSRQRIHREGDLWLAIRASMSIA 160

NOV 7: 1146 AY-GPVCQLSSMQNKGQVEELGAIKPHLCPQSETNSLQGVTRAGFSLADVARSMGAKVVI 1204
Sbjct: 161 GLLPPVCQNGHLLLDGGY VNNLP ADVMRALGADIVI 196

NOV 7: 1205 AIDVGSRDETDLTNYGDALSGWWLLWKRWNPLATKVKVLNMAEIQTRLAYVCCVRQLEVV 1264
Sbjct: 197 AVDVGSADI*TNIJDLYGFSI,SGEWII,FKRVmPFGARIJlII*KMSEIQRRlAyVPa^^ 256

NOV 7: 1265 KSSDYCEYLRPPIDSYSTLDFGKFNEICEVGYQHGR 1300 (SEQ ID NO: 177)
Sbjct: 257 KNTTVYCRYLKRPIEAFDTIxDFSKFPEIPQIGVIiYFK 292 (SEQ ID NO: 178)



Table 7G. Domain Analysis of NOV7

gnl|Pfam|pfam00027, cNMP_binding, Cyclic nucleotide-binding domain.
CD-Length = 94 residues, 100.0% aligned
Score = 78.6 bits (192), Expect = 2e-15

NOV 7: 653 ALDWVEVEAGRAIYRQGDKSDCTYIMLSGRLRSVIRKDDGKKRLAGEYGRGDLVGVVETL 712
Sbjct: 1 ALEERSYPAGEVIIRQGDPGDSLYIVVSGSVEVYRLLEDGREQIVGTLGPGDLFGELALL 60

NOV 7: 713 THQARATTVHAVRDSELAKLPAGALTCIKRRYPQ 746 (SEQ ID NO: 179)
Sbjct: 61 TNPPRTATVRALTDCELLRLDREDFERLLEQYPE 94 (SEQ ID NO: 180)

gnl|Pfam|pfam00027, cNMP_binding, Cyclic nucleotide-binding domain.
CD-Length = 94 residues, 93.6% aligned
Score = 76.6 bits (187), Expect = 9e-15

NOV 7: 541 HVPAGTVVSRQGDQDASILFVVSGLLHVYQRKIGSQEDTCLFLTRPGEMVGQLAVLTGEP 600
Sbjct: 6 SYPAGEVIIRQGDPGDSLYIVVSGSVEVYRLLEDGREQIVGTL-GPGDLFGELALLTNPP 64

NOV 7: 601 LIFTVKANRDCSFLSISKAHFYEIMRKQP 629 (SEQ ID NO: 181)
Sbjct: 65 RTATVRALTDCELLRLDREDFERLLEQYP 93 (SEQ ID NO: 182)

gnl|Pfam|pfam00027, cNMP_binding, Cyclic nucleotide-binding domain.
CD-Length = 94 residues, 100.0% aligned
Score = 64.3 bits (155), Expect = 4e-11

NOV 7: 160 HIVFVQLQEGEHVFQPREPDPSICVVQDGRLEVCIQDTDGTEVVVKEVLAGDSVHSLLSI 219
Sbjct: 1 ALEERSYPAGEVIIRQGDPGDSLYIVVSGSVEVYRLLEDGREQIVGTLGPGDLFGELALL 60

NOV 7: 220 LDIITGHAAPYKTVSVRAAIPSTILRLPAAAFHGVFEKYPE 260 (SEQ ID NO: 183)
Sbjct: 61 TN PPRTATVRALTDCELLRLDREDFERLLEQYPE 94 (SEQ ID NO: 184)



gnl|Smart|smart00100, cNMP, Cyclic nucleotide-monophosphate binding domain;
Catabolite gene activator protein (CAP) is a prokaryotic homologue of eukaryotic
cNMP-binding domains, present in ion channels, and cNMP-dependent kinases.
CD-Length = 121 residues, 94.2% aligned
Score = 66.2 bits (160), Expect = 1e-11

NOV 7: 645 SFVRQIDFALDWVEVEAGRAIYRQGDKSDCTYIMLSGRLRSVIRKDDGKKRLAGEYGRGD 704
Sbjct: 8 EELRELADALEPVRYPAGEVIIRQGDVGDSFYIIVSGEVEVYKTLEDGREQILGTLGPGD 67

NOV 7: 705 LVGVVETLTHQARATTVHAVRDSELAKLPAGALTCIKRRYPQVVTRLIHLLGEKI 759 (SEQ ID NO: 185)
Sbjct: 68 FPGBIALL3WiaiRARSA-AAVM»BIJ4JCIJJRIDFRDFI,QIi 121 (SEQ ID NO: 186)

gnl|Smart|smart00100, cNMP, Cyclic nucleotide-monophosphate binding domain;
Catabolite gene activator protein (CAP) is a prokaryotic homologue of eukaryotic
cNMP-binding domains, present in ion channels, and cNMP-dependent kinases.
CD-Length = 121 residues, 97.5% aligned
Score = 63.9 bits (154), Expect = 6e-11

NOV 7: 145 VLGHFEKPLFLELCKHIVFVQLQEGEHVFQPREPDPSICVVQDGRLEVCIQDTDGTEVVV 204
Sbjct: 1 LFKALDAEELRELADALEPVRYPAGEVIIRQGDVGDSFYIIVSGEVEVYKTLEDGREQIL 60

NOV 7: 205 KEVLAGDSVHSLLSILDIITGHAAPYKTVSVRAAIPSTILRLPAAAFHGVFEKYPETLVR 264
Sbjct: 61 GTLGPGDFF GELALLTNRRRAR-SAAAVALELAKLLRIDFRDFLQLLPEIPQLLLE 115

[Final alignment block (NOV 7 from position 265; Sbjct from position 116) is not legible in this copy.]

gnl|Smart|smart00100, cNMP, Cyclic nucleotide-monophosphate binding domain;
Catabolite gene activator protein (CAP) is a prokaryotic homologue of eukaryotic
cNMP-binding domains, present in ion channels, and cNMP-dependent kinases.
CD-Length = 121 residues, 74.4% aligned
Score = 55.1 bits (131), Expect = 3e-08

NOV 7: 541 HVPAGTVVSRQGDQDASILFVVSGLLHVYQRKIGSQEDTCLFLTRPGEMVGQLAVLTGE- 599
Sbjct: 21 RYPAGEVIIRQGDVGDSFYIIVSGEVEVYKT-LEDGREQILGTLGPGDFFGELALLTNRR 79

NOV 7: 600 -PLIFTVKANRDCSFLSISKAHFYEIMRKQP 629 (SEQ ID NO: 189)
Sbjct: 80 RARSAAAVALELAKLLRIDFRDFLQLLPEIP 110 (SEQ ID NO: 190)



Uncharacterized protein family UPF0028 (Interpro IPR001423): A number of 
prokaryotic and eukaryotic uncharacterized proteins belong to this family. These proteins are 
of variable size and share a glycine-rich domain of about 200 residues that is located at the C- 
terminus of the eukaryotic members of this family. 

Cyclic nucleotide-binding domain (Interpro IPR000595): Proteins that bind cyclic
nucleotides (cAMP or cGMP) share a structural domain of about 120 residues. The best
studied of these proteins is the prokaryotic catabolite gene activator (also known as the cAMP
receptor protein) (gene crp) where such a domain is known to be composed of three
alpha-helices and a distinctive eight-stranded, antiparallel beta-barrel structure. There are six
invariant amino acids in this domain, three of which are glycine residues that are thought to
be essential for maintenance of the structural integrity of the beta-barrel. cAMP- and
cGMP-dependent protein kinases (cAPK and cGPK) contain two tandem copies of the cyclic
nucleotide-binding domain. The cAPKs are composed of two different subunits, a catalytic
chain and a regulatory chain, which contains both copies of the domain. The cGPKs are
single chain enzymes that include the two copies of the domain in their N-terminal section.
Vertebrate cyclic nucleotide-gated ion channels also contain this domain. Two such cation
channels have been fully characterized; one is found in rod cells, where it plays a role in
visual signal transduction.

The novel protein of the invention is similar to Neuropathy Target Esterases and 
Swiss Cheese proteins and therefore is likely to share some of their properties which are 
described below. Covalent modification of Neuropathy Target Esterase (human NTE) by 
certain organophosphorus esters (OPs) leads, after a delay of several days, to a degeneration 
of long axons in the spinal cord and peripheral nerves (organophosphate-induced 
neuropathy). The active-site serine of NTE lies in the center of a predicted hydrophobic helix 
within a 200-amino-acid C-terminal domain with marked similarity to conceptual proteins in 
bacteria, yeast and nematodes; these proteins may comprise a novel family of potential serine 
hydrolases. 

NTE shares 41% amino acid sequence identity with the Drosophila 'Swiss Cheese' 
(Sws) protein, which is involved in the regulation of interactions between neurons and glia in 
the developing fly brain. Swiss cheese (sws) mutant flies develop normally during larval life 
but show age-dependent neurodegeneration in the pupa and adult and have reduced life span. 
In late pupae, glial processes form abnormal, multilayered wrappings around neurons and 
axons. Degeneration first becomes evident in young flies as apoptosis in single scattered cells 
in the CNS, but later it becomes severe and widespread. In the adult, the number of glial 
wrappings increases with age. The sws gene is expressed in neurons in the brain cortex. It is 
suggested that the novel SWS protein plays a role in a signaling mechanism between neurons 
and glia that regulates glial wrapping during development of the adult brain. 

The observation that the Swiss Cheese protein, when mutated, leads to widespread cell 
death in Drosophila brain suggests that genetically altered NTE, because of its homology to 
Swiss Cheese protein, may be involved in human neurodegenerative disease. The murine 
sws/NTE gene is 96% identical to NTE. During development the Msws transcript is 
expressed in the embryonic respiratory system, different epithelial structures and strongly in 
the spinal ganglia. Postnatally, Msws mRNA is expressed in all brain areas, with an 
increasingly restrictive pattern. In adult mice expression is most prominent in Purkinje cells, 
granule cells and pyramidal neurons of the hippocampus and some large neurons in the 
medulla oblongata, nucleus dentatus and pons. 

The novel Neuropathy Target Esterase/Swiss Cheese protein family member 
described in this invention is therefore anticipated to have similar biochemical and 
physiological roles as described above for family members. 

The protein similarity information, expression pattern, cellular localization, and map 
location for the NOV7 protein and nucleic acid disclosed herein suggest that this Neuropathy 
target esterase/Swiss Cheese protein-like protein may have important structural and/or 
physiological functions characteristic of the Neuropathy target esterase/Swiss Cheese protein 
family. Therefore, the nucleic acids and proteins of the invention are useful in potential 
diagnostic and therapeutic applications and as a research tool. These include serving as a 
specific or selective nucleic acid or protein diagnostic and/or prognostic marker, wherein the 
presence or amount of the nucleic acid or the protein are to be assessed. These also include 
potential therapeutic applications such as the following: (i) a protein therapeutic, (ii) a small 
molecule drug target, (iii) an antibody target (therapeutic, diagnostic, drug targeting/cytotoxic 
antibody), (iv) a nucleic acid useful in gene therapy (gene delivery/gene ablation), (v) an 
agent promoting tissue regeneration in vitro and in vivo, and (vi) a biological defense 
weapon. 

The nucleic acids and proteins of the invention have applications in the diagnosis 
and/or treatment of various diseases and disorders. For example, the compositions of the 
present invention will have efficacy for the treatment of patients suffering from: cancer, 
trauma, regeneration (in vitro and in vivo), viral/bacterial/parasitic infections, 
cardiomyopathy, atherosclerosis, hypertension, congenital heart defects, aortic stenosis, 
atrial septal defect (ASD), atrioventricular (A-V) canal defect, ductus arteriosus, pulmonary 
stenosis, subaortic stenosis, ventricular septal defect (VSD), valve diseases, tuberous 
sclerosis, scleroderma, obesity, aneurysm, hypertension, fibromuscular dysplasia, stroke, 
scleroderma, obesity, transplantation, myocardial infarction, embolism, cardiovascular 
disorders, bypass surgery, anemia, bleeding disorders, scleroderma, transplantation, 
adrenoleukodystrophy, congenital adrenal hyperplasia, diabetes, Von Hippel-Lindau (VHL) 
syndrome, pancreatitis, hyperparathyroidism, hypoparathyroidism, hyperthyroidism, 
hypothyroidism, SIDS, endometriosis, fertility, xerostomia, scleroderma, hypercalcemia, 
ulcers, cirrhosis, inflammatory bowel disease, diverticular disease, Hirschsprung's disease, 
Crohn's Disease, appendicitis, hemophilia, hypercoagulation, idiopathic thrombocytopenic 
purpura, autoimmune disease, allergies, immunodeficiencies, transplantation, graft versus 
host disease, anemia, ataxia-telangiectasia, autoimmune disease, immunodeficiencies, 
hemophilia, hypercoagulation, idiopathic thrombocytopenic purpura, allergies, 
immunodeficiencies, transplantation, graft versus host disease (GVHD), lymphaedema, 
tonsillitis, hypogonadism, osteoporosis, hypercalcemia, arthritis, ankylosing spondylitis, 
scoliosis, arthritis, tendinitis, muscular dystrophy, Lesch-Nyhan syndrome, myasthenia 
gravis, dental disease, Alzheimer's disease, stroke, tuberous sclerosis, hypercalcemia, 
Parkinson's disease, Huntington's disease, cerebral palsy, epilepsy, multiple sclerosis, 
leukodystrophies, behavioral disorders, addiction, anxiety, pain, neurodegeneration, 
endocrine dysfunctions, diabetes, obesity, growth and reproductive disorders, multiple 
sclerosis, leukodystrophies, pain, neuroprotection, systemic lupus erythematosus, 
autoimmune disease, asthma, emphysema, scleroderma, allergy, ARDS, pharyngitis, 
laryngitis, diabetes, tuberous sclerosis, hearing loss, tinnitus, psoriasis, actinic keratosis, 
tuberous sclerosis, acne, hair growth/loss, alopecia, pigmentation disorders, endocrine 
disorders, psoriasis, actinic keratosis, tuberous sclerosis, acne, hair growth/loss, alopecia, 
pigmentation disorders, endocrine disorders, cystitis, incontinence, diabetes, autoimmune 
disease, renal artery stenosis, interstitial nephritis, glomerulonephritis, polycystic kidney 
disease, systemic lupus erythematosus, renal tubular acidosis, IgA nephropathy, 
hypercalcemia, vesicoureteral reflux, as well as other diseases, disorders and conditions. 

The novel nucleic acid encoding the novel Neuropathy Target Esterase/Swiss Cheese 
protein-like protein of the invention, or fragments thereof, are useful in diagnostic 
applications, wherein the presence or amount of the nucleic acid or the protein are to be 
assessed. These materials are further useful in the generation of antibodies that bind 
immunospecifically to the novel substances of the invention for use in therapeutic or 
diagnostic methods. These antibodies may be generated according to methods known in the 
art, using prediction from hydrophobicity charts, as described in the "Anti-NOVX 
Antibodies" section below. The disclosed NOV7 protein has multiple hydrophilic regions, 
each of which can be used as an immunogen. In one embodiment, a contemplated NOV7 
epitope is from about amino acids 10 to 100. In another embodiment, a contemplated NOV7 
epitope is from about amino acids 205 to 220. In other specific embodiments, contemplated 
NOV7 epitopes are from about amino acids 310 to 415, 510 to 520, 570 to 580, 700 to 800, 
820 to 970, 1030 to 1210 and 1370 to 1410. 

NOV8 

A disclosed NOV8 nucleic acid (alternatively referred to herein as CG57119-01) 
encodes a novel Acid-Sensitive Potassium Channel Protein Task-like protein and includes the 
815 nucleotide sequence (SEQ ID NO:23) shown in Table 8A. An open reading frame for 
the mature protein was identified beginning with a GTG codon at nucleotides 2-4 and 
ending with a TGA codon at nucleotides 638-640. Putative untranslated regions are 
underlined in Table 8A, and the start and stop codons are in bold letters. 



Table 8A. NOV8 Nucleotide Sequence (SEQ ID NO:23) 

GGXGGGCGCTGCTGTCTTCGACGCGCTCGAGTCCGAGGCGGAAAGCGGCCGCCAGCGACTGCTGGTCCAGAAGCGG 

GGCGCTCTCCGGAGGAAGTTCGGCTTCTCGGCCKSAGGACrACCGCGAGCTGGAGCGCCT 

CCC3^CCGCGCCGGCCGC<^GTGGAAGTTCCCCGGCrCCTTCTACTTCGCCATCACCGTCATC^ 

CGGCCACGCCGCGCCGGGTACGGACrCCGGCAAGGTCTTCTGCATGTTCTACGCGCTCCTGGGCA 

CTGGTCACrrTCCAGAGCCTGGGCGAACGGCTGAACGCGGTGGTGCGGCGCCTCCTGTTGGCGGC 

TGGGCCrTGCGGTGGACGXGCGTGTCCACGGAGT^CCTGGTGGTGGCa^GGCTCCTGGC^ 

CCTCGGGGCCGTCGCCTTCTCGCACrTCGAGGGCTGGACCTTCTTCCACGCCTACTACTACTGCTT^ 

ACCACCATCGGCTTOSGCGACAACCTGGGCTTTTCGCCCCCCTCGAGCCCGGGGGTCGTGCGTGGCGGGCAGGCTC 

CCAGGCTTGGGGCCCX3GTGGAAGTCaiTCTC ACAACCCCACCCAGGCCAGGGTCGAATCTGGAATG^ 

GCTTCAGCTATCAGGGCACCCTCCCCAGGGATTGGATACGGATGACGGGCCTCTAGGCGGTCTTC^^ 

GTTTCTCATTACTGTCTGTGGCTAAGTCCCCTCCCTCCTTTCCAAAAATATATTA 

The nucleic acid sequence of NOV8 has 556 of 560 bases (99%) identical to a 
gb:GENBANK-ID:AF257081|acc:AF257081.1 mRNA from Homo sapiens (Homo sapiens 
two pore potassium channel KT3.3 mRNA, complete cds) (E = 5.6e"^^^. 

A disclosed NOV8 polypeptide (SEQ ID NO:24) is 212 amino acid residues in length 
and is presented using the one-letter amino acid code in Table 8B. The SignalP, Psort and/or 
Hydropathy results predict that NOV8 does not have a signal peptide and is likely to be 
localized to the plasma membrane with a certainty of 0.6000. In alternative embodiments, a 
NOV8 polypeptide is located to the Golgi body with a certainty of 0.4000, the endoplasmic 
reticulum (membrane) with a certainty of 0.3000 or the mitochondrial inner membrane with a 
certainty of 0.1000. 
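
The reading-frame coordinates and the polypeptide length quoted above can be cross-checked against each other. The following is a minimal sketch, assuming 1-based inclusive nucleotide coordinates and that the quoted end position is the last base of the stop codon; the function name is illustrative and not part of the disclosure.

```python
def orf_residue_count(start_first_nt: int, stop_last_nt: int) -> int:
    """Return the number of encoded residues for an open reading frame.

    start_first_nt: 1-based position of the first base of the start codon.
    stop_last_nt:   1-based position of the last base of the stop codon.
    """
    span = stop_last_nt - start_first_nt + 1
    if span <= 0 or span % 3 != 0:
        raise ValueError("coordinates do not describe a whole number of codons")
    # Every codon encodes one residue except the terminating stop codon.
    return span // 3 - 1


# NOV8: start codon at nucleotides 2-4, stop codon at nucleotides 638-640.
nov8_residues = orf_residue_count(2, 640)
print(nov8_residues)
```

Under these assumptions the NOV8 coordinates give 213 codons including the stop codon, which agrees with the stated length of 212 residues.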



Table 8B. Encoded NOV8 Protein Sequence (SEQ ID NO:24) 

VGAAVF0ALESEAESGRQRr.LVQKRGAI.RRKFGFSAEDYRELERI.AIiQAEPHRAGRQWKFPGSFYFAITVITTI 
EYGHAAPGTDSGKVFCMFYALLGIPLTLVTFQSLGERIJSrAVVRRI.LIJ^KCCLGLRWTCVSTENLVVAGLI^ 
ATIJU^GAVAFSHFEGWTFFHAYYYCFITLTTIGFGDNLGFSPPSSPGVVRGGQAPRLGARWKSI 

The NOV8 amino acid sequence was found to have 184 of 184 amino acid residues 
(100%) identical to, and 184 of 184 amino acid residues (100%) similar to, the 330 amino 
acid residue ptnr:TREMBLNEW-ACC:CAC14068 protein from Homo sapiens (Human) 
(DJ781B1.1 (A NOVEL PROTEIN SIMILAR TO THE ACID-SENSITIVE POTASSIUM 
CHANNEL PROTEIN TASK (KCNK3))) (E = 8.8e'"). 

NOV8 is expressed in at least the following tissues: pancreas, placenta, brain, lung, 
prostate, heart, kidney, uterus, small intestine and colon. Expression information was derived 
from the tissue sources of the sequences that were included in the derivation of the sequence 
of NOV8. 

Possible single nucleotide polymorphisms (SNPs) found for NOV8 are listed in Table 8C. 



Table 8C: SNPs 

Variant     Nucleotide Position   Base Change   Amino Acid Position   Amino Acid Change 
13376993    225                   A>G           75                    Glu>Gly 
13376995    605                   G>A           202                   Ala>Thr 
13376995    615                   T>C           205                   Leu>Pro 
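
The amino acid positions in the SNP table follow from the nucleotide positions once the start of the open reading frame is known. The sketch below assumes 1-based coordinates and the NOV8 reading frame beginning at nucleotide 2, as stated for Table 8A; the helper names are hypothetical.

```python
def codon_index(nt_pos: int, orf_start: int) -> int:
    """1-based index of the codon (residue) that contains nucleotide nt_pos."""
    if nt_pos < orf_start:
        raise ValueError("position lies upstream of the open reading frame")
    return (nt_pos - orf_start) // 3 + 1


def codon_offset(nt_pos: int, orf_start: int) -> int:
    """Position within the codon: 0 = first base, 1 = second, 2 = third."""
    if nt_pos < orf_start:
        raise ValueError("position lies upstream of the open reading frame")
    return (nt_pos - orf_start) % 3


NOV8_ORF_START = 2  # first base of the start codon (nucleotides 2-4)

# (nucleotide position, amino acid position) pairs as listed for NOV8.
nov8_snps = [(225, 75), (605, 202), (615, 205)]

for nt_pos, listed_aa in nov8_snps:
    computed = codon_index(nt_pos, NOV8_ORF_START)
    print(nt_pos, listed_aa, computed, codon_offset(nt_pos, NOV8_ORF_START))
```

The same arithmetic applies to any of the disclosed sequences by substituting the corresponding start position of the reading frame.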



NOV8 also has homology to the amino acid sequences shown in the BLASTP data 
listed in Table 8D. 



Table 8D. BLAST results for NOV8 

Gene Index/Identifier          Protein/Organism                     Length  Identity        Positives       Expect 
                                                                    (aa)    (%)             (%) 

gi|10944275|emb|CAC14068.1|    Two pore potassium channel KT3.3     330     184/184 (100%)  184/184 (100%)  2e-88 
(AL118522) dJ781B1.1           (LOC64181) [Homo sapiens] 

gi|11641275|ref|NP_071753.1|   potassium family, subfamily          330     183/184 (99%)   183/184 (99%)   1e-87 
(NM_022358)                    member 15; two pore potassium 
                               channel KT3.3; potassium channel, 
                               subfamily K, member 14 
                               [Homo sapiens] 

gi|14771013|ref|XP_029815.1|   potassium channel, subfamily K,      330     183/184 (99%)   183/184 (99%)   2e-87 
(XM_029815)                    member 14 [Homo sapiens] 

gi|7706135|ref|NP_057685.1|    potassium channel, subfamily K,      374     123/184 (66%)   141/184 (75%)   2e-65 
(NM_016601)                    member 9; potassium channel TASK3; 
                               acid-sensitive potassium channel 
                               protein TASK-3; TWIK-related 
                               acid-sensitive K+ channel 3 
                               [Homo sapiens] 

gi|13431425|sp|Q9JL58|         Potassium channel subfamily K        365     124/184 (67%)   140/184 (75%)   1e-64 
CIW9_CAVPO                     member 9 (Acid-sensitive potassium 
                               channel protein TASK-3) 
                               (TWIK-related acid-sensitive K+ 
                               channel 3) 



The homology of these sequences is shown graphically in the ClustalW analysis 
shown in Table 8E. 



Table 8E. ClustalW Analysis of NOV8 

1) NOV8 (SEQ ID NO:24) 
2) gi|10944275| (SEQ ID NO:191) 
3) gi|11641275| (SEQ ID NO:192) 
4) gi|14771013| (SEQ ID NO:193) 
5) gi|7706135| (SEQ ID NO:194) 
6) gi|13431425| (SEQ ID NO:195) 

[ClustalW multiple sequence alignment of sequences 1 to 6; the alignment rows are not legible in this copy.] 



Duprat et al. (EMBO J 1997;16:5464-71) identified TASK as a new member of the 
recently recognized TWIK K+ channel family. This 395 amino acid polypeptide has four 
transmembrane segments and two P domains. In adult human, TASK transcripts are found in 
pancreas<placenta<brain<lung, prostate<heart, kidney<uterus, small intestine and colon. 
Electrophysiological properties of TASK were determined after expression in Xenopus 
oocytes and COS cells. TASK currents are K+-selective, instantaneous and non-inactivating. 
They show an outward rectification when external [K+] is low ([K+]out = 2 mM) which is 
not observed for high [K+]out (98 mM). The rectification can be approximated by the 
Goldman-Hodgkin-Katz current equation that predicts a curvature of the current-voltage plot 
in asymmetric K+ conditions. This strongly suggests that TASK lacks intrinsic voltage 
sensitivity. The absence of activation and inactivation kinetics as well as voltage 
independence are characteristic of conductances referred to as leak or background 
conductances. For this reason, TASK is designated as a background K+ channel. TASK is 
very sensitive to variations of extracellular pH in a narrow physiological range; as much as 
90% of the maximum current is recorded at pH 7.7 and only 10% at pH 6.7. This property is 
probably essential for its physiological function, and suggests that small pH variations may 
serve a communication role in the nervous system. 
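
The rectification argument above can be illustrated numerically. The following is a minimal sketch of the Goldman-Hodgkin-Katz current equation for a single monovalent cation, showing that a channel with no intrinsic voltage sensitivity still passes more outward than inward current when external K+ is low. The internal K+ concentration (98 mM), the temperature (22 C) and the unit permeability are assumptions chosen for illustration and are not taken from the cited work; currents are in arbitrary units.

```python
import math

FARADAY = 96485.332      # C/mol
GAS_CONSTANT = 8.314462  # J/(mol*K)


def ghk_current(v_volts, k_in_mM, k_out_mM, temp_kelvin=295.15, permeability=1.0):
    """GHK current for K+ (z = +1); positive values are outward current."""
    u = FARADAY * v_volts / (GAS_CONSTANT * temp_kelvin)
    if abs(u) < 1e-9:
        # Limit of the GHK expression as the membrane potential approaches 0.
        return permeability * FARADAY * (k_in_mM - k_out_mM)
    e = math.exp(-u)
    return permeability * FARADAY * u * (k_in_mM - k_out_mM * e) / (1.0 - e)


def reversal_potential(k_in_mM, k_out_mM, temp_kelvin=295.15):
    """Nernst potential for K+ in volts."""
    return GAS_CONSTANT * temp_kelvin / FARADAY * math.log(k_out_mM / k_in_mM)


K_IN = 98.0  # assumed internal [K+] in mM (illustrative)

for k_out in (2.0, 98.0):
    e_k = reversal_potential(K_IN, k_out)
    outward = ghk_current(e_k + 0.060, K_IN, k_out)  # 60 mV above reversal
    inward = ghk_current(e_k - 0.060, K_IN, k_out)   # 60 mV below reversal
    print(f"[K+]out = {k_out:5.1f} mM: |outward/inward| = {abs(outward / inward):.2f}")
```

With equal K+ on both sides the computed current-voltage relation is linear, whereas with 2 mM external K+ the outward branch is larger than the inward branch for the same driving force, which is the curvature referred to in the text.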

Lesage et al. (EMBO J 1996;15:1004-11) isolated a new human weakly inward 
rectifying K+ channel, TWIK-1. This channel is 336 amino acids long and has four 
transmembrane domains. Unlike other mammalian K+ channels, it contains two pore- 
forming regions called P domains. Genes encoding structural homologues are present in the 
genome of Caenorhabditis elegans. TWIK-1 currents expressed in Xenopus oocytes are 
time-independent and present a nearly linear I-V relationship that saturates for 
depolarizations positive to 0 mV in the presence of internal Mg2+. This inward rectification 
is abolished in the absence of internal Mg2+. TWIK-1 has a unitary conductance of 34 pS 
and a kinetic behavior that is dependent on the membrane potential. In the presence of 
internal Mg2+, the mean open times are 0.3 and 1.9 ms at -80 and +80 mV, respectively. The 
channel activity is up-regulated by activation of protein kinase C and down-regulated by 
internal acidification. Both types of regulation are indirect. TWIK-1 channel activity is 
blocked by Ba2+ (IC50=100 microM), quinine (IC50=50 microM) and quinidine (IC50=95 
microM). This channel is of particular interest because its mRNA is widely distributed in 
human tissues, and is particularly abundant in brain and heart. TWIK-1 channels are probably 
involved in the control of background K+ membrane conductances. 

The first member of this family (TOK1) cloned from S. cerevisiae is predicted to have 
eight potential transmembrane (TM) helices. However, subsequently-cloned two P-domain 
family members from Drosophila and mammalian species are predicted to have only four TM 
segments. They are usually referred to as TWIK-related channels (Tandem of P-domains in a 
Weakly Inward rectifying K+ channel). Functional characterization of these channels has 
revealed a diversity of properties in that they may show inward or outward rectification, their 
activity may be modulated in different directions by protein phosphorylation, and their 
sensitivity to changes in intracellular or extracellular pH varies. Despite these disparate 
properties, they are all thought to share the same topology of four TM segments, including 
two P-domains. That TWIK-related K+ channels all produce instantaneous and non- 
inactivating K+ currents, which do not display a voltage-dependent activation threshold, 
suggests that they are background (leak) K+ channels involved in the generation and 
modulation of the resting membrane potential in various cell types. Further studies have 
revealed that they may be found in many species, including: plants, invertebrates and 
mammals. 

TASK is a member of the TWIK-related (two P-domain) K+ channel family identified 
in human tissues. It is widely distributed, being particularly abundant in the pancreas and 
placenta, but it is also found in the brain, heart, lung and kidney. Its amino acid identity to 
TWIK-1 and TREK-1 is rather low, being about 25-28%. However, it is thought to share the 
same topology of four TM segments, with two P-domains. TASK is very sensitive to 
variations in extracellular pH in the physiological range, changing from fully-open to closed 
in approximately 0.5 pH units around pH 7.4. Thus, it may well be a biological sensor of 
external pH variations. 
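
The steepness of the pH dependence can be made concrete with a simple titration model. The sketch below assumes a Hill-type relation, f(pH) = 1 / (1 + 10^(n (pK - pH))), which is an illustrative choice and not a model stated in the source, and solves for the midpoint pK and Hill coefficient n from the two figures quoted in this section for TASK (90% of maximum current at pH 7.7 and 10% at pH 6.7). All function names are hypothetical.

```python
import math


def hill_from_two_points(ph_a, frac_a, ph_b, frac_b):
    """Solve for (pK, n) in f(pH) = 1 / (1 + 10**(n * (pK - pH))).

    Rearranged, log10(f / (1 - f)) = n * (pH - pK), a straight line in pH,
    so two (pH, fraction) points determine both parameters.
    """
    logit_a = math.log10(frac_a / (1.0 - frac_a))
    logit_b = math.log10(frac_b / (1.0 - frac_b))
    n = (logit_a - logit_b) / (ph_a - ph_b)
    pk = ph_a - logit_a / n
    return pk, n


def open_fraction(ph, pk, n):
    """Fraction of maximum current at a given external pH under the assumed model."""
    return 1.0 / (1.0 + 10.0 ** (n * (pk - ph)))


pk, n = hill_from_two_points(7.7, 0.90, 6.7, 0.10)
print(f"pK = {pk:.2f}, Hill coefficient = {n:.2f}")
for ph in (6.7, 7.0, 7.2, 7.4, 7.7):
    print(f"pH {ph:.1f}: {open_fraction(ph, pk, n):.2f} of maximum current")
```

Under this assumed model the two quoted points imply a midpoint between them and a Hill coefficient close to 2, consistent with a channel that switches over a narrow pH range.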

The protein similarity information, expression pattern, cellular localization, and map 
location for the protein and nucleic acid disclosed herein suggest that this Acid-Sensitive 
Potassium Channel Protein Task-like protein may have important structural and/or 
physiological functions characteristic of the Ion Channel family. Therefore, the nucleic acids 
and proteins of the invention are useful in potential diagnostic and therapeutic applications 
and as a research tool. These include serving as a specific or selective nucleic acid or protein 
diagnostic and/or prognostic marker, wherein the presence or amount of the nucleic acid or 
the protein are to be assessed. These also include potential therapeutic applications such as 
the following: (i) a protein therapeutic, (ii) a small molecule drug target, (iii) an antibody 
target (therapeutic, diagnostic, drug targeting/cytotoxic antibody), (iv) a nucleic acid useful in 
gene therapy (gene delivery/gene ablation), (v) an agent promoting tissue regeneration in 
vitro and in vivo, and (vi) a biological defense weapon. 

The nucleic acids and proteins of the invention have applications in the diagnosis 
and/or treatment of various diseases and disorders. For example, the compositions of the 
present invention will have efficacy for the treatment of patients suffering from: diabetes, 
Von Hippel-Lindau (VHL) syndrome, pancreatitis, obesity, fertility, Alzheimer's disease, 
stroke, hypercalcemia, Parkinson's disease, Huntington's disease, cerebral palsy, epilepsy, 
Lesch-Nyhan syndrome, multiple sclerosis, ataxia-telangiectasia, leukodystrophies, 
behavioral disorders, addiction, anxiety, pain, neurodegeneration, systemic lupus 
erythematosus, autoimmune disease, asthma, emphysema, scleroderma, allergies, ARDS, 
cardiomyopathy, atherosclerosis, hypertension, congenital heart defects, aortic stenosis, atrial 
septal defect (ASD), atrioventricular (A-V) canal defect, ductus arteriosus, pulmonary 
stenosis, subaortic stenosis, ventricular septal defect (VSD), valve diseases, tuberous 
sclerosis, transplantation, renal artery stenosis, interstitial nephritis, glomerulonephritis, 
polycystic kidney disease, renal tubular acidosis, IgA nephropathy, endometriosis, 
inflammatory bowel disease, diverticular disease, as well as other diseases, disorders and 
conditions. 

The novel nucleic acid encoding the novel protein of the invention, or fragments 
thereof, are useful in diagnostic applications, wherein the presence or amount of the nucleic 
acid or the protein are to be assessed. These materials are further useful in the generation of 
antibodies that bind immunospecifically to the novel substances of the invention for use in 
therapeutic or diagnostic methods. These antibodies may be generated according to methods 
known in the art, using prediction from hydrophobicity charts, as described in the "Anti-NOVX 
Antibodies" section below. The disclosed NOV8 protein has multiple hydrophilic 
regions, each of which can be used as an immunogen. In one embodiment, a contemplated 
NOV8 epitope is from about amino acids 20 to 30. In another embodiment, a contemplated 
NOV8 epitope is from about amino acids 41 to 45. In other specific embodiments, 
contemplated NOV8 epitopes are from about amino acids 49 to 55, 70 to 75 and 190 to 205. 

NOV9 

A disclosed NOV9 nucleic acid (designated as CuraGen Acc. No. CG57143-01) 
encodes a novel Ribosomal Protein-like protein and includes the 711 nucleotide sequence 
(SEQ ID NO:25) shown in Table 9A. An open reading frame for the mature protein was 
identified beginning with an ATG codon at nucleotides 44-46 and ending with a TAG codon 
at nucleotides 674-676. The start and stop codons are in bold letters in Table 9A. 

Table 9A. NOV9 Nucleotide Sequence (SEQ ID NO:25) 

TCTCTCTCTCTCTCrrCTCTCTCTGGTGAACAGGACCCGTCGCCATGGGCCGTGTGATCCGT^^ 

CGCCGGGTCTGTGTTCCGCGCGCACGTGAAGCACCGTAAAGGCGCTGCGCGCCTGCGCGCCGTGGACT^ 

GCGGCACGGCTACATCAAGGGCATCGTCAAGGCCCAGCTCAACATTGGCAATGTGCTCCCT^ 

TGAGGGTACAATCGTGTGCTGCCTGGAGGAGAAGCCTGGAGACCGTGGCy^GCTGGCCCGGGCATCAGGGAACTA 

TGCCACCGTTATCTCCCACAACCCTGAGACCAAGAAGACCCGTGTGAAGCTGCCCTCCGGCTCC^ 

CTCCTCAGCCAACAGAGCTGTGGTTGGTGTGGTGGCTGGAGGTGGCCGAATTGACaAACCC^^ 

CCGGGCGTACCACAAATATAAGGCAAAGAGGAACTGCTGGCCACGAGTACGGGGTGTGGCCATGAATC^ 

GCATCCTTTTGGAGGTGGCAACCACCAGCACATCGGCAAGCCCTCC^CCTVTCaSC^^ 

CAAAGTGGGTCTCATTGCTGCCCGCCGGACTGGACGTCTCCGGGGAACCAAGACTGTGCAGGAGAAAGAGAACTA 
6TGCTGAQGGCCTCAATAAAGTTTGTGTTTATGCCA 

The nucleic acid sequence of NOV9 maps to chromosome 8 and has 
574 of 610 bases (94%) identical to a gb:GENBANK-ID:HSRBPL8|acc:Z28407.1 mRNA 
from Homo sapiens (H.sapiens mRNA for ribosomal protein L8) (E = 9.9e"^^^). 

The NOV9 polypeptide (SEQ ID NO:26) is 210 amino acid residues in length and is 
presented using the one-letter amino acid code in Table 9B. The SignalP, Psort and/or 
Hydropathy results predict that NOV9 does not have a signal peptide and is likely to be 
localized to the nucleus with a certainty of 0.9749. In alternative embodiments, a NOV9 
polypeptide is located to the mitochondrial matrix space with a certainty of 0.4248, the 
microbody (peroxisome) with a certainty of 0.3000, or the lysosome (lumen) with a certainty 
of 0.2783. 

Table 9B. Encoded NOV9 Protein Sequence (SEQ ID NO:26) 

MGRVIRGQRKGAGSVFRAHVKHRKGAARLRAVDFAEimGYIKGIVKAQIJSriGlSnmPVG™^ 
DRGKIARASGNYATVISHNPETKKTRVKLPSGSKKVISSANRAVVGVVAGGGRIDKPIIJCAGRAYHK^ 
WPRVRGVAMNPVEHPFGGGNHQHIGKPSTIRRDAPAGRKVGLIAARRTGRLRGTKTVQEKEJI 

The NOV9 amino acid sequence was found to have 170 of 196 amino acid residues 
(86%) identical to, and 175 of 196 amino acid residues (89%) similar to, the 257 amino acid 
residue ptnr:SWISSNEW-ACC:P25120 protein from Homo sapiens (Human), Rattus 
norvegicus (Rat), and (60S RIBOSOMAL PROTEIN L8) (E = L2e ^^). 
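
The identity and similarity percentages quoted for these comparisons can be reproduced from the residue counts. The sketch below assumes that the whole-number percentage is obtained by truncation rather than rounding, an assumption suggested by 170 of 196 being reported as 86%; the function name is illustrative.

```python
def percent_truncated(matches: int, aligned: int) -> int:
    """Whole-number percentage of matching positions, truncated toward zero."""
    if aligned <= 0 or not 0 <= matches <= aligned:
        raise ValueError("matches must lie between 0 and the aligned length")
    # Integer arithmetic avoids floating-point error near whole percentages.
    return (100 * matches) // aligned


# Counts quoted in the text for NOV9.
comparisons = {
    "nucleotide identity": (574, 610),
    "protein identity": (170, 196),
    "protein similarity": (175, 196),
}

for label, (matches, aligned) in comparisons.items():
    print(label, f"{matches}/{aligned}", f"{percent_truncated(matches, aligned)}%")
```

Under that assumption the three NOV9 comparisons evaluate to the percentages given in the text.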

NOV9 is expressed in at least the following tissues: granulosa cells, white blood cells, 
bone marrow, liver, lung, placenta and whole organism. Expression information was derived 
from the tissue sources of the sequences that were included in the derivation of the sequence 
of NOV9. 

Possible single nucleotide polymorphisms (SNPs) found for NOV9 are listed in Table 9C. 

Table 9C: SNPs 

Variant     Nucleotide Position   Base Change   Amino Acid Position   Amino Acid Change 
13376997    152                   C>T           37                    Arg>Trp 
13376996    611                   C>T           190                   Leu>Phe 



NOV9 also has homology to the amino acid sequences shown in the BLASTP data 
listed in Table 9D. 



Table 9D. BLAST results for NOV9 

Gene Index/Identifier          Protein/Organism                     Length  Identity        Positives       Expect 
                                                                    (aa)    (%)             (%) 

gi|730576|sp|P41116|           60S RIBOSOMAL PROTEIN L8             257     204/257 (79%)   210/257 (81%)   2e-92 
RL8_XENLA 

gi|4506663|ref|NP_000964.1|    ribosomal protein L8; 60S            257     210/257 (81%)   210/257 (81%)   2e-89 
(NM_000973)                    ribosomal protein L8 
                               [Homo sapiens] 

gi|15082586|gb|AAH12197.1|     Similar to ribosomal protein L8      257     209/257 (81%)   210/257 (81%)   3e-89 
AAH12197 (BC012197)            [Homo sapiens] 

gi|15293881|gb|AAK95133.1|     ribosomal protein L8                 257     198/257 (77%)   204/257 (79%)   3e-86 
AF401561_1 (AF401561)          [Ictalurus punctatus] 

gi|12652605|gb|AAH00047.1|     Similar to ribosomal protein L8      214     170/196 (86%)   175/196 (88%)   3e-75 
AAH00047 (BC000047)            [Homo sapiens] 



The homology of these sequences is shown graphically in the ClustalW analysis 
shown in Table 9E. 



Table 9E. ClustalW Analysis of NOV9 

1) NOV9 (SEQ ID NO:26) 
2) gi|730576| (SEQ ID NO:196) 
3) gi|4506663| (SEQ ID NO:197) 
4) gi|15082586| (SEQ ID NO:198) 
5) gi|15293881| (SEQ ID NO:199) 
6) gi|12652605| (SEQ ID NO:200) 

[ClustalW multiple sequence alignment of sequences 1 to 6; the alignment rows are not legible in this copy.] 




Table 9F lists the domain description from DOMAIN analysis results against NOV9. 
This indicates that the NOV9 sequence has properties similar to those of other proteins 
known to contain these domains. 






Table 9F. Domain Analysis of NOV9 






gnl|Pfam|pfam00181, Ribosomal_L2, Ribosomal Proteins L2. 
CD-Length = 229 residues, 100.0% aligned 
Score = 177 bits (450), Expect = 4e-46 








46 




1 1 h 1 ll+tl 1 lilM II 
Sbj 1 GRNNRGHITRIOJRGGGHKRLYRAIDFKRRKGYIKGTVKRIEYDPNRSAP 


60 




N0V9 :47 AQLNIGNVLPVGTMPEGTIVCCI.EEKPGDRGKLARASGN 

1 + imihl +11111+ +1I1III Wlll+I 
Sbj 61 KRYIL2^EC^HVGDTiySGKNATIKIGNVLPI*GEIPEGTIIHNVEEKP(a>GGQIJyU^ 


85 
120 




KOV9 : 86 YATVISHNPETKKTRVKI,PSGSKKVISSANRAW6VVaGGGRIDK^ 

II + II nil mi K +11 II +11111111111+ Mill ++ t 

Sbj r 121 YAQIIAHDGD- KKTRViCLPSGEKRRVSSECRATIGVVANGGRIDKPIiGKAGRA- -RWLGK 


145 

177 




N0V9 : 14 6 RNCWPRVRGVAMNPVEHPFGGGNHQHIGKPSTIRRDAPAGRKVGLIAARRTGRLRGT 2 0 2 { SEQ 

1 immiin+ii iii +i i i+i ii in ii 


ID NO: 201) 


Sbjr 178 R PRVRGVAMNPVDHPHGGGEGRHP- - IGRKSPVTPWGKKALGIATRRTKRLSDK 229 (SEQ 


ID NO:202) 



The mammalian ribosome is composed of 4 RNA species (see 180450) and 
approximately 80 different proteins (see 180466). 

The rat ribosomal protein L8 (Rpl8) associates with 5.8S rRNA, very likely 
participates in the binding of aminoacyl-tRNA, and has been identified as a constituent of the 
EF2 (130610)-binding site at the ribosomal subunit interface. By screening a human ovarian 
granulosa cell cDNA expression library with antibodies against human follicular fluid 
glycoproteins, Hanes et al (1993) isolated a partial RPL8 cDNA. They completed the full- 
length cDNA sequence using PCR. The deduced 257-amino acid human RPL8 protein is 
identical to rat Rpl8. Northern blot analysis detected a 900-bp RPL8 transcript in human 
granulosa cells and white blood cells. By somatic cell hybrid and radiation hybrid mapping 
analyses, Kenmochi et al (1998) mapped the human RPL8 gene to 8q. 

Ribosomal_L2 (Ribosomal Proteins L2), amino acid 13 to 46 and 47 to 210. 
Ribosomal protein L2 is one of the proteins from the large ribosomal subunit. In Escherichia 
coli, L2 is known to bind to the 23S rRNA and to have peptidyltransferase activity. It 
belongs to a family of ribosomal proteins which, on the basis of sequence similarities, groups: 
Eubacterial L2, Algal and plant chloroplast L2, Cyanelle L2, Archaebacterial L2, Plant L2, 
Slime mold L2, Marchantia polymorpha mitochondrial L2, Paramecium tetraurelia 
mitochondrial L2, Fission yeast K5, K37 and KD4, Yeast YL6, Vertebrate L8. See Interpro 
IPR002171. 

The protein similarity information, expression pattern, cellular localization, and map 
location for the protein and nucleic acid disclosed herein suggest that this Ribosomal 
Protein-like protein may have important structural and/or physiological functions characteristic of 
the Ribosomal Proteins family. Therefore, the nucleic acids and proteins of the invention are 
useful in potential diagnostic and therapeutic applications and as a research tool. These 
include serving as a specific or selective nucleic acid or protein diagnostic and/or prognostic 
marker, wherein the presence or amount of the nucleic acid or the protein are to be assessed. 
These also include potential therapeutic applications such as the following: (i) a protein 
therapeutic, (ii) a small molecule drug target, (iii) an antibody target (therapeutic, diagnostic, 
drug targeting/cytotoxic antibody), (iv) a nucleic acid useful in gene therapy (gene 
delivery/gene ablation), (v) an agent promoting tissue regeneration in vitro and in vivo, and 
(vi) a biological defense weapon. 

The nucleic acids and proteins of the invention have applications in the diagnosis 
and/or treatment of various diseases and disorders. For example, the compositions of the 
present invention will have efficacy for the treatment of patients suffering from: hemophilia, 
hypercoagulation, idiopathic thrombocytopenic purpura, autoimmune disease, allergies, 
asthma, immunodeficiencies, transplantation, graft versus host disease, Von Hippel-Lindau 
(VHL) syndrome, cirrhosis, systemic lupus erythematosus, emphysema, scleroderma, ARDS, 
fertility as well as other diseases, disorders and conditions. 

The novel nucleic acid encoding the novel Ribosomal Protein-like protein of the 
invention, or fragments thereof, is useful in diagnostic applications, wherein the presence or 
amount of the nucleic acid or the protein is to be assessed. These materials are further 
useful in the generation of antibodies that bind immunospecifically to the novel substances of 
the invention for use in therapeutic or diagnostic methods. These antibodies may be 
generated according to methods known in the art, using prediction from hydrophobicity 
charts, as described in the "Anti-NOVX Antibodies" section below. The disclosed NOV9 
protein has multiple hydrophilic regions, each of which can be used as an immunogen. In 
one embodiment, a contemplated NOV9 epitope is from about amino acids 10 to 15. In 
another embodiment, a contemplated NOV9 epitope is from about amino acids 40 to 42. In 
other specific embodiments, contemplated NOV9 epitopes are from about amino acids 55 to 
57, 70 to 75, 90 to 95, 99 to 110, 135 to 150, 155 to 175, 180 to 183, 190 to 193 and 199 to 
201. 

NOV10 

A disclosed NOV10 nucleic acid (designated as CuraGen Acc. No. CG56860-01) 
encodes a novel Prostaglandin Omega Hydroxylase-like protein and includes the 1503 
nucleotide sequence (SEQ ID NO:27) shown in Table 10A. An open reading frame for the 
mature protein was identified beginning with an ATG codon at nucleotides 11-13 and ending 
with a TAG codon at nucleotides 1493-1495. Putative untranslated regions downstream from 
the termination codon are underlined in Table 10A, and the stop codon is in bold letters. 
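
The open reading frame coordinates stated above can be checked arithmetically against the polypeptide length given for SEQ ID NO:28. The following Python sketch is illustrative only and is not part of the disclosed methods; the function name orf_protein_length is an assumption introduced here, and the inputs are the first nucleotide of the ATG codon (11) and the last nucleotide of the TAG codon (1495) as stated in this specification.

```python
def orf_protein_length(start_first_nt: int, stop_last_nt: int) -> int:
    """Return the number of amino acids encoded by an open reading frame.

    start_first_nt: 1-based position of the first base of the start codon.
    stop_last_nt:   1-based position of the last base of the stop codon.
    The stop codon is part of the ORF span but encodes no residue.
    """
    span = stop_last_nt - start_first_nt + 1
    if span <= 0 or span % 3 != 0:
        raise ValueError("ORF span must be a positive multiple of 3")
    return span // 3 - 1


# NOV10: ATG begins at nucleotide 11, TAG ends at nucleotide 1495.
nov10_length = orf_protein_length(11, 1495)
print(nov10_length)
```

Under these assumptions the span is 1485 nucleotides, or 495 codons including the stop codon, which corresponds to the 494 amino acid residues stated for the NOV10 polypeptide.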



Table 10A. NOV10 Nucleotide Sequence (SEQ ID NO:27) 

GTGCTGCGGCA TQAGTGTCTCTGTGCTGAACCCCAACAGACTCCCAGATGGTGTCTCAGGGCTCCTCCAAGGAGC 

CTCACTGCTGAGCCTGCTTCTGTTACTATTGAAGGCAGCCCAGCCCTACCTGCGGAGGCAGCGGCTGCTGCGGGA 

CCTGCGCCCCTTCCCAGCGCCCCCCACCCACTGGTTCCTTGGGCACAAGCTGATGGAAAAATACCCATGTGCTGT 

TCCCTTGTGGGTTGGACCCTTTACGATGTTCTTCAGTGTCCATGACCCAGACTATGCCAAGATTCTCCTGAAAAG 

ACAAGGTAAAAACCAAGAGGGGTTTCTGCCTTTTATTTCTCAAGGAAAAGGACTAGCGGCTCTAGACGGACCCAA 

GTGGTTCCAGCATCGTCGCCTACTAACTCCTGGATTCCATTTTAACATCCTGAAAGCATACATTGAGGTGATGGC 

TCATTCTGTGAAAATGATGCTGAACAAATGGGAGGAACACATTGCCCAAAACTCACGTCTGGAGCTCTTTCAACA 

TGTCTCCCTGATGACCCTGGACAGCATCATGAAGTGTGCCTTCAGCCACCAGGGCAGCATCCAGTTGGACAGGTC 

ATCATACCTGAAAGCAGTGTTCAACCTTAGCAAAATCTCCAACCAGCGCATGAACAATTTTCTACATCAC 

CCTGGTTTTCAAATTCAGCTCrCAAGGCCAAATCTTTTCTAAATTTAACCAAGAACTTCATCAGCATCTAGAGAA 

AGTAATCCAGGACCGGAAGGAGTCTCTTAAGGATAAGCTAAAACAAGATACTACTCAGAAAAGGCGCTGGGATTT 

TCTGGACATACTTTTGAGTGCCAAAGTAGAAAACACCAAAGATTTCTCTGAAGCAGATCTCCAGGCTGAAGTGAA 

AACGTTCATGTTTGCAGGACATGACACCACATCCAGTGCTATCTCCTGGATCCTTTACTGCTTGGCAAAGTACCC 

TGAGCATCAGCAGAGATGCCGAGATGAAATCAGGGAACTCCTAGGGGATGGGTCTTCTATTACCTGGCACCTGAG 

CCAGATGCCTTACACCACGATGTGCATCAAGGAATGCCTCCGCCTCTACGCACCGGTAGTAAACATATCCCGGTT 

ACTCGACAAACCCATCACCTTTCCAGATGGACGCTCCTTACCTGCAGGGATCACCGTGGTTCTTAGTATTTGGGG 

TCTTCACCACAACCCTGCTGTCTGGAAAAACGTACAGGTCTTTGACCCCTTGAGGTTCTCTCAGGAGAATTCTGA 

TCAGAGACACCCCTATGCCTACTTACCATTCTCAGCTGGATCAAGGAACTGCATTGGGCAGGAGTTTGCCATGAT 

TGAGTTAAAGGTAACCATTGCCTTGATTCTGCTCCACTTCAGAGTGACTCCAGACCCCACCAGGCCTCTTACTTT 

CCCCAACCATTTTATCCTCAAGCCCAAGAATGGGATGTATTTGCACCTGAAGAAACTCTCT 

AGG 



The nucleic acid sequence of NOV10 maps to chromosome 1 and has 525 of 755 
bases (69%) identical to a gb:GENBANK-ID:HUMCYTFAOH|acc:L04751.1 mRNA from 
Homo sapiens (Human cytochrome p-450 4A (CYP4A) mRNA, complete cds) (E = 1.6e-[exponent illegible]). 

A disclosed NOV10 polypeptide (SEQ ID NO:28) is 494 amino acid residues in 
length and is presented using the one-letter amino acid code in Table 10B. The SignalP, 
Psort and/or Hydropathy results predict that NOV10 has a signal peptide and is likely to be 
localized to the plasma membrane with a certainty of 0.6000. In alternative embodiments, a 
NOV10 polypeptide is located to the Golgi body with a certainty of 0.4000, the endoplasmic 
reticulum (membrane) with a certainty of 0.3000, or the microbody (peroxisome) with a 
certainty of 0.3000. The SignalP predicts a likely cleavage site for a NOV10 peptide between 
amino acid positions 35 and 36, i.e. at the sequence KAA-QP. 



Table 10B. Encoded NOV10 Protein Sequence (SEQ ID NO:28) 

MiviviSipMSipDGvsGi^ 

LWVGPFTMFFSVHDPDYJUCILLKRQGKNQEGFLPFISQGKGLAjaJ>3PK^ 
AHSVKMMimWEEHIAQNSRLELFQHVSIiMTLDSIMKCAFSHQGSIQLDRSSYL^^ 

HISIDLVFKFSSQGQIFSKFNQEIJiQHLEKVIQDRKBSLKDKLKQDTTQKia^ 

QAEVKTFMFAGHDTTSSAISWILYCLAKYPEHQQRCRDEII^LLGDGSSITWHLSQMPYTTMCIKECLRLYA^ 
VVNISRI*IX>KPITFPDGRSLPAGITVVI.SIWGIJIHNPAWKNVQVFDPI^ 

CIGQEFAMIELKOTIALILLHFRVTPDPTRPLTFPNHFILKPKN<ammKKLSEC 

The NOV10 amino acid sequence was found to have 281 of 509 amino acid residues 
(55%) identical to, and 369 of 509 amino acid residues (72%) similar to, the 510 amino acid 
residue ptnr:pir-id:A29368 protein from rabbit (prostaglandin omega-hydroxylase (EC 
1.14.15.-) cytochrome P450 4A4) (E = [value illegible]). 
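
The percentages quoted with each match count can be reproduced from the counts themselves. The sketch below is a hedged illustration, not the method used to generate the figures: it assumes the whole-number percentages are truncated rather than rounded, an assumption that is consistent with the values quoted for NOV10 (525 of 755 bases, 281 of 509 residues, 369 of 509 residues). The function name blast_percent is hypothetical.

```python
def blast_percent(matches: int, aligned: int) -> int:
    """Whole-number percentage of matching positions.

    Assumption: the value is truncated toward zero (not rounded), which is
    consistent with the percentages quoted in the text for NOV10.
    """
    if aligned <= 0 or not 0 <= matches <= aligned:
        raise ValueError("need 0 <= matches <= aligned and aligned > 0")
    return (100 * matches) // aligned


# Counts quoted in the text for NOV10.
print(blast_percent(525, 755))  # nucleotide identity
print(blast_percent(281, 509))  # amino acid identity
print(blast_percent(369, 509))  # amino acid similarity
```

With truncation, the three calls give 69, 55 and 72, matching the quoted percentages; ordinary rounding would give 70 for the nucleotide comparison.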

NOV10 is expressed in at least the following tissues: Brain, Substantia Nigra, 
Hippocampus, Hypothalamus, Kidney, Lung, Mammary gland/Breast, Parietal Lobe, 
Prostate, and Uterus. Expression information was derived from the tissue sources of the 
sequences that were included in the derivation of the sequence of NOV10. 

NOV10 also has homology to the amino acid sequences shown in the BLASTP data 
listed in Table 10C. 



Table 10C. BLAST results for NOV10 

Gene Index/Identifier: gi|2493371|sp|Q02928|CP4Y_HUMAN 
Protein/Organism: CYTOCHROME P450 4A11 PRECURSOR (CYPIVA11) (FATTY ACID 
OMEGA-HYDROXYLASE) (P-450 HK OMEGA) (LAURIC ACID OMEGA-HYDROXYLASE) 
(CYP4AII) (P450-HL-OMEGA) 
Length (aa): 519 
Identity (%): 282/511 (55%) 
Positives (%): 358/511 (69%) 
Expect: e-146 

Gene Index/Identifier: gi|203787|gb|AAA41038.1| (M57718) 
Protein/Organism: cytochrome P-450 IVA1 [Rattus norvegicus] 
Length (aa): 509 
Identity (%): 269/511 (52%) 
Positives (%): 357/511 (69%) 
Expect: e-145 

Gene Index/Identifier: gi|12832576|dbj|BAB22165.1| (AK002528) 
Protein/Organism: cytochrome P450, 4a10-data source:MGD, source key:MGI:88611, 
evidence:ISS-putative [Mus musculus] 
Length (aa): 509 
Identity (%): 271/512 (52%) 
Positives (%): 357/512 (68%) 
Expect: e-145 

Gene Index/Identifier: gi|3738263|dbj|BAA33804.1| (AB018421) 
Protein/Organism: cytochrome P-450 [Mus musculus] 
Length (aa): 509 
Identity (%): 271/512 (52%) 
Positives (%): 357/512 (68%) 
Expect: e-145 

Gene Index/Identifier: gi|4503235|ref|NP_000769.1| (NM_000778) 
Protein/Organism: cytochrome P450, subfamily IVA, polypeptide 11; fatty acid 
omega-hydroxylase; P450HL-omega; alkane-1 monooxygenase; lauric acid 
omega-hydroxylase [Homo sapiens] 
Length (aa): 519 
Identity (%): 282/511 (55%) 
Positives (%): 358/511 (65%) 
Expect: e-145 
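
The Gene Index identifiers in Table 10C follow the pipe-delimited pattern gi|number|database|accession|, optionally followed by a locus name. As a minimal sketch, assuming only that pattern, such an identifier can be split into its fields as shown below; the function name parse_gi_identifier and the dictionary keys are hypothetical and introduced here for illustration.

```python
def parse_gi_identifier(identifier: str) -> dict:
    """Split an identifier of the form gi|number|database|accession[|locus].

    The field names used as dictionary keys are illustrative assumptions.
    """
    # Drop surrounding whitespace and any leading/trailing pipe characters.
    fields = [f.strip() for f in identifier.strip().strip("|").split("|")]
    if len(fields) < 4 or fields[0] != "gi":
        raise ValueError(f"not a gi-style identifier: {identifier!r}")
    locus = fields[4] if len(fields) > 4 and fields[4] else None
    return {
        "gi": int(fields[1]),
        "database": fields[2],
        "accession": fields[3],
        "locus": locus,
    }


print(parse_gi_identifier("gi|2493371|sp|Q02928|CP4Y_HUMAN"))
print(parse_gi_identifier("gi|203787|gb|AAA41038.1|"))
```

The same pattern applies to the identifiers listed in the BLAST result tables for the other NOVX sequences.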











The homology of these sequences is shown graphically in the ClustalW analysis 
shown in Table 10D. 



Table 10D. ClustalW Analysis of NOV10 

1) NOV10 (SEQ ID NO:28) 
2) gi|2493371| (SEQ ID NO:203) 
3) gi|203787| (SEQ ID NO:204) 
4) gi|12832576| (SEQ ID NO:205) 
5) gi|3738263| (SEQ ID NO:206) 
6) gi|4503235| (SEQ ID NO:207) 

[ClustalW multiple sequence alignment of sequences 1 to 6; the aligned residues are not legible in this copy.] 

Table 10E lists the domain description from DOMAIN analysis results against 
NOV10. This indicates that the NOV10 sequence has properties similar to those of other 
proteins known to contain these domains. 

Table 10E. Domain Analysis of NOV10 

gnl|Pfam|pfam00067, p450, Cytochrome P450. Cytochrome P450s are involved in 
the oxidative degradation of various compounds. Particularly well known for 
their role in the degradation of environmental toxins and mutagens. Structure 
is mostly alpha, and binds a heme cofactor. 
CD-Length = 445 residues, 98.9% aligned 
Score = 304 bits (778), Expect = 9e-84 

NOV10:  52 PAPPTHWFLGH-------------KLMEKYPCAVPLWVGPFTMFFSVHDPDYAKILLKRQ  98 
Sbjct:   2 PGPPPLPLLGNLLQLGRGPIHSLTELRKKYGPVFTLYLGPRPVVV-VTGPEAVKEVLIDK  60 

NOV10:  99 GKNQEGFLPFISQ---GKGLAALDGPKWFQHRRLLTPGFHFNILKAYIEVMAHSVKMMLN 155 
Sbjct:  61 GEEFAGRGDFPVFPWLGYGILFSNGPRWRQLRRLLTLRF-FGMGKRS-KLEERIQEEARD 118 

NOV10: 156 KWEE-HIAQNSRLELFQHVSLMTLDSIMKCAFSHQGSIQLDRSSYLKAVFNLSKISNQRM 214 
Sbjct: 119 LVERLRKEQGSPIDITELLAPAPLNVICSLLFGV--RFDYEDPEFLKLIDKLNE-LFFLV 175 

NOV10: 215 NNFLHHNDLVFKFSSQGQIFSKFNQELHQHLEKVIQDRKESLKDKLKQDTTQKRRWDFLD 274 
Sbjct: 176 SPWGQLLDFFRYLPGSHRKAFKAAKDLKDYLDKLIEERRETLEPGDPR--------DFLD 227 

NOV10: 275 ILL-SAKVENTKDFSEADLQAEVKTFMFAGHDTTSSAISWILYCLAKYPEHQQRCRDEIR 333 
Sbjct: 228 SLLIEAKREGGSELTDEELKATVLDLLFAGTDTTSSTLSWALYLLAKHPEVQAKLREEID 287 

NOV10: 334 ELLGDGSSITW-HLSQMPYTTMCIKECLRLYAPVV-NISRLLDKPITFPDGRSLPAGITV 391 
Sbjct: 288 EVIGRDRSPTYDDRANMPYLDAVIKETLRLHPVVPLLLPRVATEDTEI-DGYLIPKGTLV 346 

NOV10: 392 VLSIWGLHHNPAVWKNVQVFDPLRFSQENSDQRHPYAYLPFSAGSRNCIGQEFAMIELKV 451 
Sbjct: 347 IVNLYSLHRDPKVFPNPEEFDPERFLDENGKFKKSYAFLPFGAGPRNCLGERLARMELFL 406 

[The final alignment block, beginning at NOV10 residue 452 and Sbjct residue 407, and the match lines between the sequences are not legible in this copy.] 



P450 4A4 is a cytochrome P450 that is elevated during pregnancy. This P-450 
isozyme regiospecifically hydroxylates PGE1, PGA1, and PGF2 alpha at carbon-20 (the 
omega position). This enzyme catalyzes the hydroxylation of PGA1 in the presence of 
NADPH. 

The protein similarity information, expression pattern, cellular localization, and map 
location for the NOV10 protein and nucleic acid disclosed herein suggest that this 
prostaglandin omega-hydroxylase-like protein may have important structural and/or 
physiological functions characteristic of the PG omega/omega-1 hydroxylase family. 
Therefore, the nucleic acids and proteins of the invention are useful in potential diagnostic 
and therapeutic applications and as a research tool. These include serving as a specific or 
selective nucleic acid or protein diagnostic and/or prognostic marker, wherein the presence or 
amount of the nucleic acid or the protein is to be assessed. These also include potential 
therapeutic applications such as the following: (i) a protein therapeutic, (ii) a small molecule 
drug target, (iii) an antibody target (therapeutic, diagnostic, drug targeting/cytotoxic 
antibody), (iv) a nucleic acid useful in gene therapy (gene delivery/gene ablation), (v) an 
agent promoting tissue regeneration in vitro and in vivo, and (vi) a biological defense 
weapon. 

The nucleic acids and proteins of the invention have applications in the diagnosis 
and/or treatment of various diseases and disorders. For example, the compositions of the 
present invention will have efficacy for the treatment of patients suffering from: Von Hippel-Lindau 
(VHL) syndrome, Alzheimer's disease, Stroke, Tuberous sclerosis, hypercalcemia, 
Parkinson's disease, Huntington's disease, Cerebral palsy, Epilepsy, Lesch-Nyhan syndrome, 
Multiple sclerosis, Ataxia-telangiectasia, Leukodystrophies, Behavioral disorders, Addiction, 
Anxiety, Pain, Neuroprotection, Systemic lupus erythematosus, Autoimmune disease, 
Asthma, Emphysema, Scleroderma, allergy, Diabetes, Autoimmune disease, Renal artery 
stenosis, Interstitial nephritis, Glomerulonephritis, Polycystic kidney disease, Systemic lupus 
erythematosus, Renal tubular acidosis, IgA nephropathy, Hypercalcemia as well as other 
diseases, disorders and conditions. 

The novel nucleic acid encoding the Prostaglandin Omega Hydroxylase-like protein 
of the invention, or fragments thereof, is useful in diagnostic applications, wherein the 
presence or amount of the nucleic acid or the protein is to be assessed. These materials are 
further useful in the generation of antibodies that bind immunospecifically to the novel 
substances of the invention for use in therapeutic or diagnostic methods. These antibodies 
may be generated according to methods known in the art, using prediction from 
hydrophobicity charts, as described in the "Anti-NOVX Antibodies" section below. The 
disclosed NOV10 protein has multiple hydrophilic regions, each of which can be used as an 
immunogen. In one embodiment, a contemplated NOV10 epitope is from about amino acids 
40 to 50. In another embodiment, a contemplated NOV10 epitope is from about amino acids 
51 to 55. In other specific embodiments, contemplated NOV10 epitopes are from about 
amino acids 100 to 102, 105 to 106, 130 to 132, 140 to 143, 160 to 165, 190 to 215, 240 to 
265, 290 to 295, 330 to 340, 370 to 373, 410 to 440 and 470 to 490. 
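
The contemplated epitope ranges listed above can be checked against the stated 494 residue length of the NOV10 polypeptide. The following Python sketch is a simple consistency check under that assumption and is not a method of epitope prediction; the function name ranges_outside_protein is hypothetical.

```python
def ranges_outside_protein(ranges, protein_length):
    """Return the (start, end) ranges not fully inside residues 1..protein_length."""
    return [
        (start, end)
        for start, end in ranges
        if start < 1 or end > protein_length or start > end
    ]


# Epitope ranges and polypeptide length as stated in the text for NOV10.
NOV10_LENGTH = 494
NOV10_EPITOPES = [
    (40, 50), (51, 55), (100, 102), (105, 106), (130, 132), (140, 143),
    (160, 165), (190, 215), (240, 265), (290, 295), (330, 340),
    (370, 373), (410, 440), (470, 490),
]

print(ranges_outside_protein(NOV10_EPITOPES, NOV10_LENGTH))
```

An empty list indicates that every listed range lies within the polypeptide; the same check can be applied to the ranges and lengths stated for the other NOVX proteins.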

NOV11 

The disclosed NOV11 nucleic acid (designated as CuraGen Acc. No. CG57024-01) 
encodes a novel Myeloid Upregulated Protein-like protein and includes the 1408 nucleotide 
sequence (SEQ ID NO:29) shown in Table 11A. An open reading frame for the mature 
protein was identified beginning with an ATG codon at nucleotides 153-155 and ending with 
a TGA codon at nucleotides 1185-1187. Putative untranslated regions downstream from the 
termination codon and upstream from the initiation codon are underlined in Table 11A, and 
the start and stop codons are in bold letters. 



Table 11A. NOV11 Nucleotide Sequence (SEQ ID NO:29) 

AGCAGAGAGGCTGCCCTGCTGC7U^TGTCACCGTCGTCACTGCCTCTGCAGGCTGCAGG<:a.CCTGCCACTACC^ 

AGGACTGAQGGGCCTTGGCCCAGCAGGGACCCCAQGGCCTTGGGGGACTGTGTGAGCTGGAAACGTGGCTGGCCAG 

ATGGGCAGCACCATGGAGCCCCCTGGGGGTGCGTACCTGCACCTGGGCGCCGTGACATCCCCTGTGTGCACAGCCC 

GCGTGCTGCAGCTGGCCTTTGGCTGCACTACCTTCAGCCTGGTGGCCCACCGGGGTGGCTTTGCGGGCGTCCAGGG 

CACCTTCTGCATGGACGCCTGGGGCTTCTGCTTCGCCGTCTCTGCGCTGGTGGTGGCCTGTGAGTTCACACGGCTC 

CACGGCTGCCTGCGGCTCTCCTGGGGCAACTTCACCGCCGCCTTCGCCATGCTGGCCACCerGCTATGCGCGA^ 

OTGCGGTCCTGTATCCGCTGTACTTTGCCCGGCGGGAGTGTTCCCCCGAGCCCGCCGGCTGTGCTGCCAGGGACTT 

CCGCCTGGCAGCCAGTGTCTTCGCCGGGCTCCrCTTCCTGGCCrACGCrrGTGGAGGTGGCCCTGACGCGGGCCC^ 

CCCGGCCAGGTGAGCAGCTATATGGCCACGGTGTCGGGGCTCCTCAAGATCGTCCAGGCCTTCGTGGCCTGC^ 

TCTTCGGGGCGCTGGTCCATGACAGCCGCTACGGGCGCTACGTGGCCACCCAGTGGTGCGTGGCCGTCTAC^ 

GTGCTTCCTGGCCACAGTGGCCGTGGTGGCCCTGAGTGTGATGGGCCACACAGGa 

CGGCTGGTGGTGGTGTACACCTTCCTGGCTGTGCTCCTGTACCrCAGCGCCGCCGTGATCTGGCCAGTC^ 

TOSATCCCAAGTACGGTGAGCCa^CGGCCCCCCAACTGTGCrCGGGGCAGCTGTCCCT 

GTGGCCATCTTCIACCTACGTCAACCTGCTCCTGTACGTaSTTGACCraSCCTACTCCCAGCT 

CQGGCATCTGTGCACTGTGGGCATCTGTGGCACTGGGAGGGAGCCCGGCTGAGGGCGGCCGCT^^ 

GGGTACTGCTTGCCTCTGCTCAAGGGTCCAGTTGCCGAAACTCCTG ACGCCGGGGCCATCAT 

CCAGCTTCTCCTGCACAGAAGCCCAGCCTQGTCCAGCCT^GGAGCTGACCCACTGGCCACCCCT 

GTGGGCAGTGGCACAACAGCCCCTCAGCCCATTGACTGGGCCCCATTGACGTCCTTGAGCAGGAAATAAATQCTGA 

CATTTATACGTACCCTGCCTCTGGACCAGCAGTCTCTTCT 



The nucleic acid sequence of NOV11 maps to chromosome 2. A disclosed NOV11 
polypeptide (SEQ ID NO:30) is 344 amino acid residues in length and is presented using the 
one-letter amino acid code in Table 11B. The SignalP, Psort and/or Hydropathy results 
predict that NOV11 is likely to be localized with a certainty of 0.7480. In alternative 
embodiments, a NOV11 polypeptide is located to the plasma membrane with a certainty of 
0.7000, the endoplasmic reticulum (membrane) with a certainty of 0.2000, or the 
mitochondrial inner membrane with a certainty of 0.1000. The SignalP predicts a likely 
cleavage site for a NOV11 peptide between amino acid positions 33 and 34, i.e. at the 
sequence AFG-CT. 



Table 11B. Encoded NOV11 Protein Sequence (SEQ ID NO:30) 



MGSTMEPPGGAYIJHLGAVTSPVCTARVLQIJ^GCTTFSLVAHRGGFAGVQGTFCMD 

HGCLRLSWGNFTAAFAMI^TLLCATAAVLYPLYFARRBCSPEPAGCAARDFRIJ^SVFAGLLFLAYAVEVALT 
PGQVSSYI^TVSGLLKIVQAFVACIIFGALVHDSRYGRYVATQWCVAVYSLCFLATVAVVALSVMGHTGGLGCPFD 
RLV\Ar^TFIjAVLLYI*SAAVIWPVFCFDPKYGEPKRPPNCARGSCPWDTST?mWPSSPTSTCSCTSL^ 
RASVHCGHLWHWEQARLRAAAGHRIWVLLASAQQSSCRNS 
The NOV11 amino acid sequence was found to have 92 of 226 amino acid residues 
(40%) identical to, and 127 of 226 amino acid residues (56%) similar to, the 296 amino acid 
residue ptnr:SWISSPROT-ACC:O35682 protein from Mus musculus (Mouse) (MYELOID 
UPREGULATED PROTEIN) (E = 1.6e-[exponent illegible]). 

NOV11 is expressed in at least the lung. Expression information was derived from 
the tissue sources of the sequences that were included in the derivation of the sequence of 
NOV11. 

NOV11 also has homology to the amino acid sequences shown in the BLASTP data 
listed in Table 11C. 



Table 11C. BLAST results for NOV11 

Gene Index/Identifier: gi|12834438|dbj|BAB22911.1| (AK003645) 
Protein/Organism: evidence:NAS-hypothetical protein-putative [Mus musculus] 
Length (aa): 153 
Identity (%): 110/122 (90%) 
Positives (%): 113/122 (92%) 
Expect: 4e-51 

Gene Index/Identifier: gi|17482569|ref|XP_039907.2| (XM_039907) 
Protein/Organism: hypothetical protein XP_039907 [Homo sapiens] 
Length (aa): 322 
Identity (%): 106/266 (39%) 
Positives (%): 153/266 (56%) 
Expect: 5e-38 

Gene Index/Identifier: gi|8393800|ref|NP_058665.1| (NM_016969) 
Protein/Organism: myeloid-associated differentiation marker [Mus musculus] 
Length (aa): 296 
Identity (%): 92/226 (40%) 
Positives (%): 127/226 (55%) 
Expect: 1e-29 

Gene Index/Identifier: gi|16553192|dbj|BAB71502.1| (AK057470) 
Protein/Organism: unnamed protein product [Homo sapiens] 
Length (aa): 245 
Identity (%): 74/178 (41%) 
Positives (%): 106/178 (58%) 
Expect: 2e-24 

Gene Index/Identifier: gi|17445253|ref|XP_065813.1| (XM_065813) 
Protein/Organism: similar to hypothetical protein SB135 [Homo sapiens] 
Length (aa): 331 
Identity (%): 86/243 (35%) 
Positives (%): 127/243 (51%) 
Expect: 1e-23 



The homology of these sequences is shown graphically in the ClustalW analysis 
shown in Table 11D. 



Table 11D. ClustalW Analysis of NOV11 

1) NOV11 (SEQ ID NO:30) 
2) gi|12834438| (SEQ ID NO:210) 
3) gi|17482569| (SEQ ID NO:211) 
4) gi|8393800| (SEQ ID NO:212) 
5) gi|16553192| (SEQ ID NO:213) 
6) gi|17445253| (SEQ ID NO:214) 

[ClustalW multiple sequence alignment of sequences 1 to 6; the aligned residues are not legible in this copy.] 



The protein encoded by NOV11 has high homology to mouse myeloid upregulated 
protein. It is a multipass trans-membrane protein. Since myeloid cells are critical players in 
inflammation and immune responses, this invention is an excellent antibody target to treat 
inflammation and immune disorders or as a diagnostic marker. 

The protein similarity information, expression pattern, cellular localization, and map 
location for the NOV11 protein and nucleic acid disclosed herein suggest that this Myeloid 
Upregulated Protein-like protein may have important structural and/or physiological 
functions characteristic of the Mal family. Therefore, the nucleic acids and proteins of the 
invention are useful in potential diagnostic and therapeutic applications and as a research 
tool. These include serving as a specific or selective nucleic acid or protein diagnostic and/or 
prognostic marker, wherein the presence or amount of the nucleic acid or the protein are to be 
assessed. These also include potential therapeutic applications such as the following: (i) a 
protein therapeutic, (ii) a small molecule drug target, (iii) an antibody target (therapeutic, 
diagnostic, drug targeting/cytotoxic antibody), (iv) a nucleic acid useful in gene therapy (gene 
delivery/gene ablation), (v) an agent promoting tissue regeneration in vitro and in vivo, and 
(vi) a biological defense weapon. 

The nucleic acids and proteins of the invention have applications in the diagnosis 
and/or treatment of various diseases and disorders. For example, the compositions of the 
present invention will have efficacy for the treatment of patients suffering from: systemic 
lupus erythematosus, autoimmune disease, asthma, emphysema, scleroderma, allergy, 
ARDS, as well as other diseases, disorders and conditions. 

The novel nucleic acid encoding Myeloid Upregulated Protein-like protein of the 
invention, or fragments thereof, is useful in diagnostic applications, wherein the presence or 
amount of the nucleic acid or the protein is to be assessed. These materials are further 
useful in the generation of antibodies that bind immunospecifically to the novel substances of 
the invention for use in therapeutic or diagnostic methods. These antibodies may be 
generated according to methods known in the art, using prediction from hydrophobicity 
charts, as described in the "Anti-NOVX Antibodies" section below. The disclosed NOV11 
protein has multiple hydrophilic regions, each of which can be used as an immunogen. In 
one embodiment, a contemplated NOV11 epitope is from about amino acids 5 to 90. In 
another embodiment, a contemplated NOV11 epitope is from about amino acids 105 to 110. 
In other specific embodiments, contemplated NOV11 epitopes are from about amino acids 
170 to 180, 230 to 310, 370 to 400, 420 to 430, 450 to 455, 460 to 465, 480 to 485, 510 to 
515, 570 to 580 and 680 to 690. 

NOV12 

A disclosed NOV12 nucleic acid (designated CuraGen Acc. No. CG57083-01) 
encodes a novel Testicular Serine Protease-like protein and includes the 1113 nucleotide 
sequence (SEQ ID NO: 31) which is shown in Table 12A. An open reading frame was 
identified beginning with an ATG initiation codon at nucleotides 1-3 and ending with a TGA 
codon at nucleotides 1069-1071. The start and stop codons are in bold letters and the 
untranslated regions are underlined in Table 12A. 



Table 12A. NOV12 Nucleotide Sequence (SEQ ID NO:31) 

ATGGCCGAAGGTGAAGGGGAAGO^GCACATCTTCACATGGTGACGGGAGAGAGAAAGCXSAAG 

TGCTACACACTTTCAAACAACCAGATCrCGACATGGGCTACTGCa^^ 

CCTGCTGATGTTCCCCMGGAGAAAGAGGCCTTCTTGGCACTAGCTCAGCTGCTGACC^ 

CCAGACACTGTAGATGGACAGCTGCCTATGGGGCCTCa^CAGCCGGGCCAGCCAGGTGG^ 

CATCAAGCAAGGTGGACCGGGGTGTCTCCACAGTGTGTGGGAAGCCTAAGGTGGTGGGGAAGATCTATGG 

TGGCCGGGACGCAGCAGCTGGCCAGTGGCCATGGCAGGCCAGCCTGCTCTACrGGGGCTCG 

GGAGCTGTCCTCa.TCGACTCCTGCTGGCTGGTATCAACTACCCACTGCrTTAAATCCCAGGCCCCG 

ACTATCAGGTTCTGTTGGGAAACATCCAACTGTATCATCaAACXICAGCACACCCAGAAG^ 

CCGGATCATCACCCATCCAGACTTTGAGAAGCTCCACCCCTTTGGQAGTGACATTGCCATGTrGCA 

CACCTGCCTATGAACTTCACTTCCTACATTGTCCCTGTCTGCCrrCCCATCCCGGGACATC^ 

GTAACGTGTCCTGTTGGATAACCGGCTGGGGAATGCTCACCGAAGACCm'GTTCTCAG^ 

GGGGCCTCTAGTCTGCTACCTCCCCAGTGCCTGGGTCCTGGTGGGGCTGGCCAGCTGGGGCCTGGACTGC 

CGGCATCCTGCCTACCCCAGCATCrTCACC».GGGTCACCTACTTCMCAACrc 

GGCTCACTCCTCrrrCTGACCCCGCGCTGGCTCCTCACACCTGCTCTCCACCCAAGCCTCTGAGGGCT 

TGGCCTGCCTGGGCCCTGCGCAGCCCTTGTGCTGCCACAGACCTGGCTCCTGCTGCCACTTACCCTCAGG 

GCCCCATGGCAGACCCTGTGATGACCGCAGAGCCCCTCX3ACCCCTTCTCTCTGCTCGGCCTAG 



The nucleic acid sequence of NOV12 maps to chromosome 9 and has 354 of 536 
bases (66%) identical to a gb:GENBANK-ID:AB008910|acc:AB008910.1 mRNA from Mus 
musculus (Mus musculus mRNA for TESP1, complete cds) (E = 1.4e-[exponent illegible]). 

A disclosed NOV12 polypeptide (SEQ ID NO:32) is 356 amino acid residues and is 
presented using the one letter code in Table 12B. The SignalP, Psort and/or Hydropathy 
results predict that NOV12 does not have a signal peptide and is likely to be localized to the 
microbody (peroxisome) with a certainty of 0.5783. In alternative embodiments, a NOV12 
polypeptide is located to the lysosome (lumen) with a certainty of 0.2299 or the 
mitochondrial matrix space with a certainty of 0.1000. 



Table 12B. NOV12 protein sequence (SEQ ID NO:32) 

MAEGEGEASTSSHGTCKSKAKREVIxHTFKQPDIiDMGYCQGVSQVAVVIJ^FPKEKE^ 

TVTOQLPMGPHSRASQVAPETTSSKVDRGVSTVCGKPKVVGKIYGGRDi^GQWPWQASLLYWGSHLCGAV^ 
IDSCWLVSTTHCFKSQAPKNYQVLLGNIQLYHQTQHTQKMSVHRIITHPDFEKLHPFGSDIiOT^LHLP^ 
TSYIVPVC3JPSRDMQI»PSNVSCWITGWGMLTEDLCSQGDSGGPLVCYIiPSAWVLVGLASWGLDCRHPAyPSl 
FTRVTYFINWIDKIMRLTPLSDPAIJ^HTCSPPKPLRAAGLPGPCAALVLPQTTOjLLPLTLRAP 
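
As a consistency check between Table 12A and Table 12B, the first codons of the NOV12 nucleotide sequence can be translated with the standard genetic code. The sketch below is illustrative only: the names CODON_TABLE and translate are introduced here, the codon table is built from the conventional T, C, A, G ordering, and the input is limited to the first 21 nucleotides of Table 12A.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate DNA from its first base, stopping at the first stop codon.

    Trailing bases that do not form a complete codon are ignored. Characters
    other than A, C, G and T raise KeyError.
    """
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    protein = []
    for i in range(0, usable, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First 21 nucleotides of the NOV12 sequence (Table 12A).
print(translate("ATGGCCGAAGGTGAAGGGGAA"))
```

The translated prefix, MAEGEGE, corresponds to the first seven residues of the NOV12 protein sequence in Table 12B.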

The NOV12 amino acid sequence was found to have 140 of 142 amino acid residues 
(98%) identical to, and 140 of 142 amino acid residues (98%) similar to, the 148 amino acid 
residue ptnr:TREMBLNEW-ACC:CAC12709 protein from Homo sapiens (Human) 
(BA62C3.1 (SIMILAR TO TESTICULAR SERINE PROTEASE)) (E = 1.4e-[exponent illegible]). 

NOV12 is expressed in at least the testis. Expression information was derived from 
the tissue sources of the sequences that were included in the derivation of the sequence of 
NOV12. 

NOV12 also has homology to the amino acid sequences shown in the BLASTP data 
listed in Table 12C. 



Table 12C. BLAST results for NOV12 

Gene Index/Identifier: gi|17469644|ref|XP_071013.1| (XM_071013) 
Protein/Organism: similar to bA62C3.1 (similar to testicular serine protease) 
[Homo sapiens] 
Length (aa): 365 
Identity (%): 305/372 (81%) 
Positives (%): 307/372 (81%) 
Expect: e-161 

Gene Index/Identifier: gi|12314133|emb|CAC12709.1| (AL136097) 
Protein/Organism: bA62C3.1 (similar to testicular serine protease) [Homo sapiens] 
Length (aa): 148 
Identity (%): 140/142 (98%) 
Positives (%): 140/142 (98%) 
Expect: 3e-77 

Gene Index/Identifier: gi|6678293|ref|NP_033381.1| (NM_009355) 
Protein/Organism: testicular serine protease 1 [Mus musculus] 
Length (aa): 367 
Identity (%): 108/287 
Positives (%): 160/287 (55%) 
Expect: 3e-49 

Gene Index/Identifier: gi|6678295|ref|NP_033382.1| (NM_009356) 
Protein/Organism: testicular serine protease 2 [Mus musculus] 
Length (aa): 366 
Identity (%): 95/276 (34%) 
Positives (%): 135/276 (48%) 
Expect: 2e-41 

Gene Index/Identifier: gi|6009515|dbj|BAA84941.1| (AB018694) 
Protein/Organism: epidermis specific serine protease [Xenopus laevis] 
Length (aa): 389 
Identity (%): 86/265 (32%) 
Positives (%): 123/265 (45%) 
Expect: 1e-37 



The homology of these sequences is shown graphically in the ClustalW analysis 
shown in Table 12D. 



Table 12D. ClustalW Analysis of NOV12 

1) NOV12 (SEQ ID NO:32) 
2) gi|17469644| (SEQ ID NO:215) 
3) gi|12314133| (SEQ ID NO:216) 
4) gi|6678293| (SEQ ID NO:217) 
5) gi|6678295| (SEQ ID NO:218) 
6) gi|6009515| (SEQ ID NO:219) 

[ClustalW multiple sequence alignment of sequences 1 to 6; the aligned residues are not legible in this copy.] 



Tables 12E and 12F list the domain descriptions from DOMAIN analysis results
against NOV12. This indicates that the NOV12 sequence has properties similar to those of
other proteins known to contain these domains.



Table 12E. Domain Analysis of NOV12 



gnl|Smart|smart00020, Tryp_SPc, Trypsin-like serine protease; Many of these are
synthesised as inactive precursor zymogens that are cleaved during limited
proteolysis to generate their active forms. A few, however, are active as single
chain molecules, and others are inactive due to substitutions of the catalytic
triad residues.

CD-Length = 230 residues, 100.0% aligned
Score = 174 bits (442), Expect = 6e-45



NOV12:  114  KIYGGRDAAAGQWPWQASLLY-WGSHLCGAVLIDSCWLVSTTHCFKSQAPKNYQVLLGNI  172
Sbjct:    1  RIVGGSEANIGSFPWQVSLQYRGGRHFCGGSLISPRWVLTAAHCVYGSAPSSIRVRLGSH  60

NOV12:  173  QLYHQTQHTQKMSVHRIITHPDFEKLHPFGSDIAMLQLHLPMNFTSYIVPVCLPSRDMQL  232
Sbjct:   61  DLS-SGEBTQTVKVSKVIVHPimjP-STYDNDIALIJCLSEPVTLSDTVRPICLPSSGYWV  118

NOV12:  233  PSNVSCWITGWG MDTEDLCS  252
Sbjct:  119  PAGTTCTVSGWGRTSESSGSLPDTLQEVNVPIVSNATC3RRAYSGGPAITDM«LCaGGLEG  178

NOV12:  253  -----QGDSGGPLVCYLPSAWVLVGLASWGLD-CRHPAYPSIFTRVTYPINWI  299  (SEQ ID NO:220)
Sbjct:  179  GKDACQGDSGGPLVCNDPR-WVLVGIVSWGSYGCARPNKPGVYTRVSSYLDWI  230  (SEQ ID NO:221)



Table 12F. Domain Analysis of NOV12 

gnl|Pfam|pfam00089, trypsin, Trypsin. Proteins recognized include all proteins
in families S1, S2A, S2B, S2C, and S5 in the classification of peptidases.
Also included are proteins that are clearly members, but that lack peptidase
activity, such as haptoglobin and protein Z (PRTZ*).
CD-Length = 217 residues, 100.0% aligned
Score = 153 bits (386), Expect = 2e-38



NOV12:  115  IYGGRDAAAGQWPWQASLLYWGSHLCGAVLIDSCWLVSTTHCFKSQAPKNYQVLLGNIQL  174
Sbjct:    1  IVGGREAQAGSFPWQVSLQVSSGHFCGGSLISENWVLTAAHCVSG--ASSVRWLGBHNI,  58

NOV12:  175  YHQTQHTQKMSVHRIITHPDFEKLHPFGSDIAMLQLHLPMNFTSYIVPVCLPSRDMQLPS  234
Sbjct:   59  GTTEGTEQKFDVKKIIVHPNYN---PDTNDIALLKLKSPVTLGDTVRPICLPSASSDLPV  115

NOV12:  235  NVSCWITGWG MLTEDLCS QG  254
Sbjct:  116  GTTCSVSGWGRTKNLGTSDTLQEVVVPrVSRETaRSAYGGTVTDTMICAGALGGKDACQG  175

NOV12:  255  DSGGPLVCYLPSAWVLVGLASWGLDCRHPAYPSIFTRVTYFINWI  299  (SEQ ID NO:222)
Sbjct:  176  DSGGPLVC---SDGELVGIVSWGYGCAVGNYPGVYTRVSRYXDWI  217  (SEQ ID NO:223)

Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being
found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity,
including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over
20 families (denoted S1 - S27) of serine protease have been identified, these being grouped
into 6 clans (SA, SB, SC, SE, SF and SG) on the basis of structural similarity and other
functional evidence. Structures are known for four of the clans (SA, SB, SC and SE): these
appear to be totally unrelated, suggesting at least four evolutionary origins of serine
peptidases and possibly many more. See Interpro (IPR001254).

Notwithstanding their different evolutionary origins, there are similarities in the 
reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C 
clans have a catalytic triad of serine, aspartate and histidine in common: serine acts as a 
nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations 
of the catalytic residues are similar between families, despite different protein folds. The 
linear arrangements of the catalytic residues commonly reflect clan relationships. For 
example the catalytic triad in the chymotrypsin clan (SA) is ordered HDS, but is ordered 
DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). 
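
The relationship between triad order and clan described above can be sketched in a few lines of Python. This is only an illustration: the function and dictionary names are hypothetical, and the residue numbers in the usage lines are the conventional textbook numbering for chymotrypsin and subtilisin BPN', not values taken from this document.

```python
# Linear order of the catalytic triad residues reported for each clan
# in the paragraph above.
CLAN_BY_TRIAD_ORDER = {
    "HDS": "SA (chymotrypsin clan)",
    "DHS": "SB (subtilisin clan)",
    "SDH": "SC (carboxypeptidase clan)",
}


def triad_order(his_pos: int, asp_pos: int, ser_pos: int) -> str:
    """Return the letters H, D and S sorted by 1-based sequence position."""
    ranked = sorted([(his_pos, "H"), (asp_pos, "D"), (ser_pos, "S")])
    return "".join(letter for _, letter in ranked)


def clan_from_triad(his_pos: int, asp_pos: int, ser_pos: int) -> str:
    """Look up the clan implied by the triad order, or 'unassigned'."""
    order = triad_order(his_pos, asp_pos, ser_pos)
    return CLAN_BY_TRIAD_ORDER.get(order, "unassigned")


# Assumed example inputs (standard numbering, not from this document):
# chymotrypsin His57, Asp102, Ser195; subtilisin BPN' Asp32, His64, Ser221.
print(clan_from_triad(57, 102, 195))
print(clan_from_triad(64, 32, 221))
```

A lookup of this kind only reflects the linear order of the three residues; it says nothing about fold or activity, which would still have to be established experimentally.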

The trypsin family is almost totally confined to animals, although trypsin-like
enzymes are found in actinomycetes of the genera Streptomyces and Saccharopolyspora, and 
in the fungus Fusarium oxysporum. The enzymes are inherently secreted, being synthesised 
with a signal peptide that targets them to the secretory pathway. Animal enzymes are either 
secreted directly, packaged into vesicles for regulated secretion, or are retained in leukocyte 
granules. 

The protein similarity information, expression pattern, cellular localization, and map
location for the NOV12 protein and nucleic acid disclosed herein suggest that this Testicular
Serine Protease-like protein may have important structural and/or physiological functions
characteristic of the trypsin family. Therefore, the nucleic acids and proteins of the invention
are useful in potential diagnostic and therapeutic applications and as a research tool. These
include serving as a specific or selective nucleic acid or protein diagnostic and/or prognostic
marker, wherein the presence or amount of the nucleic acid or the protein is to be assessed.
These also include potential therapeutic applications such as the following: (i) a protein
therapeutic, (ii) a small molecule drug target, (iii) an antibody target (therapeutic, diagnostic,
drug targeting/cytotoxic antibody), (iv) a nucleic acid useful in gene therapy (gene
delivery/gene ablation), (v) an agent promoting tissue regeneration in vitro and in vivo, and
(vi) a biological defense weapon.

The nucleic acids and proteins of the invention have applications in the diagnosis
and/or treatment of various diseases and disorders. For example, the compositions of the
present invention will have efficacy for the treatment of patients suffering from prostate
cancer or infertility as well as other diseases, disorders and conditions.

The novel nucleic acid encoding the Testicular Serine Protease-like protein of the 
invention, or fragments thereof, are useful in diagnostic applications, wherein the presence or 
amount of the nucleic acid or the protein are to be assessed. These materials are further 
useful in the generation of antibodies that bind immunospecifically to the novel substances of 
the invention for use in therapeutic or diagnostic methods. These antibodies may be 
generated according to methods known in the art, using prediction from hydrophobicity 
charts, as described in the "Anti-NOVX Antibodies" section below. The disclosed NOV12
protein has multiple hydrophilic regions, each of which can be used as an immunogen. In
one embodiment, a contemplated NOV12 epitope is from about amino acids 10 to 25. In
another embodiment, a contemplated NOV12 epitope is from about amino acids 70 to 85. In
other specific embodiments, contemplated NOV12 epitopes are from about amino acids 101 
to 104, 120 to 140, 155 to 205, 240 to 245, 260 to 265, 290 to 298 and 310 to 320. 

NOV13 

One NOVX protein of the invention, referred to herein as NOV13, includes two
Hepatitis B Virus (HBV) Associated Factor-like proteins. The disclosed proteins have been
named NOV13a and NOV13b.

NOV13a 

A disclosed NOV13a (designated CuraGen Acc. No. CG56961-01), which encodes a
novel Hepatitis B Virus (HBV) Associated Factor-like protein and includes the 2393 nucleotide
sequence (SEQ ID NO:33), is shown in Table 13A. An open reading frame for the mature
protein was identified beginning with an ATG initiation codon at nucleotides 157-159 and
ending with a TGA stop codon at nucleotides 1687-1689. Putative untranslated regions are
underlined in Table 13A, and the start and stop codons are in bold letters.
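
As a consistency check on the open reading frame coordinates given above for NOV13a, the following Python sketch converts start and stop codon positions into an encoded protein length. The function name is hypothetical; the only inputs are the first nucleotide of the ATG (157) and the last nucleotide of the TGA (1689) stated in the text.

```python
def orf_protein_length(start_codon_first_nt: int, stop_codon_last_nt: int) -> int:
    """Number of amino acids encoded by an ORF given 1-based, inclusive
    coordinates of the first base of the start codon and the last base
    of the stop codon. The stop codon itself encodes no residue."""
    span = stop_codon_last_nt - start_codon_first_nt + 1
    if span % 3 != 0:
        raise ValueError("ORF span is not a whole number of codons")
    return span // 3 - 1


# NOV13a: ATG at nucleotides 157-159, TGA at nucleotides 1687-1689.
print(orf_protein_length(157, 1689))  # 510
```

The result agrees with the 510 amino acid length reported for the NOV13a polypeptide (SEQ ID NO:34).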

Table 13A. NOV13a Nucleotide Sequence (SEQ ID NO:33)

ACAGCATAATATCAAAACACACAGGGCTCGGGCCGCGCCGGAGGCCACACGGCCTGGCTGAGTTGCTCCTGGT 

CTCCCGCCTCTCCCAGGCGACCCGGAGGTAGCATTTCCCAGGAGGCACGGTCCCCCCCAGGGGGATGGGCACA 

GCCACGCCAGA TGGACGAGAAGACCAAGAAAGCAGAGGAAATGGCCCTGAGCCTCACCCGAGCAGTGGCGGGC 

GGGGATGAACAGGTGGCAATGAAGTGTGCCATCTGGCTGGCAGAGCAACGGGTGCCCCTGAGTGTGCAACTGA . 

AGCCTGAGGTCTCCCCAACGCAGGACATCAGGCTGTGGGTGAGCGTGGAGGATGCTCAGATGCACACCGTCAC 

CATCTGGCTCACAGTGCGCCCTGATATGACCGTGGCGTCTCTCAAGGACATGGTTTTTCTGGACTATGGCTTC 

CCACCAGTCTTGCAGCAGTGGGTGATTGGGCAGCGGCTGGCACGAGACCAGGAGACCCTGCACTCCCATGGGG 




TGCGGCAGAATGGGGACAGTGCCTACCTCTATCTGCTGTCaGCCCGCAACACCrCCCT 

GCAGCGGGAGCGGCAGCTGCGGATGCTGGAAGATCTGGGCTTCAAGGACCTCaCGCTGCAGC^ 

CTGGAGCCAGGCCCCCX:MAGCCCGGGGTCCCCCTGGAACCCGGACGGGG6CAGCa^TGCAGTG^ 

CCCCACCGGTGGGCTGGCAGTGCCCCGGGTGCT^CerTCATCAACAAGCCO^CGCGGCCTGGCTCT 

CTGCCGGGCGCGCCCCGAGGCCTACCAGGTCCCOSCCTCATACCAGCCOSACGAGGAGGAGCGAGCGCGCCTG 

GCGGGCGAGGAGGAGGCGCTGCGTCAGTACCAGCAGCGGAAGCAGCAGCAGCaGGAGGGGAACTACCTGCa 

ACGTCCAGCTGGACCAGAGGAGCCTGGTGCTGAACACGGAGCCCGCCGAGTGCCCCGT6TGCTACTCGGTGCT 

GGOKrCCXSGCGAGGCCGTGGTGCTGCGTGAGTGTCTGCACACCTTCTGCAGGGAGTGCCTGCA® 

CGCAACAGCCAGGAGGCGGAGGTCTCCTGCCCCTTCATTGACaAa^CCTACTC^ 

AGAGGGAGATCAAGGCGCTCCTGACCCCTGAGGATTACCAGCGATTTCTAGACCTGGGCATCTCCATT6CTGA 

AAACCGCAGTGCCTTCAGCTACCATTGCAAGACCCCAGATTGCAAGGGATGGT^^ 

AATGAGTTCACCTGCCCTGTGTGTTTCCACGTCAACTGCCTGCrCTGCAAGGCCATCCATGAGC^ 

6CAAGGAGTATCAGGAGGACCTGGCCCTGCGGGCrCAGAACGATGTGGCTGCCCGGCAGAa3ACAGAGAT(^ 

GAAGGTGATGCTGCAGCa.GGGCGAGGCCATGCGCTGCCCCCAGTGCCAGATCGTGGTAC7^GAAGA^ 

TGCGACTGGATCCGOTGCACCCTCTGCCACACCGAGATCT6CTGGGTCACCAAGGG 

GGGGCCCAGGAGACTlCCTlGCGGGGGCTGCaSCTGTAGGGTAAATGGGATTCCTTGCCACCC^ 

CTGCCaiCTGl ^K:TAAAGATGGTGGGQCCACATGCrGACCCAGCCCCAaVTCCAC^ 

CAGGGAGCTTCGTGGACGGCCTTGCTTGCTCTAGCGTTGTAGGGGTCCTGCCTGCACTGC^ 

CACATCTGCCCCAGTGCCrTTGTCCTTCCCI^GGGGCTTGCa^ 

TCTGCCTGACXICXLAGCCTTAAACATAGCCCCTGGCTAGAGGCC^ 

ACTCCTCCCaCCACAACACTCATCTCTCAAACACCAAGCACTCTCAGCCTCCCCGCCTTC^^ 

CTGGGGCTAACTTCTCTGCCTTTGTGGTTGGAGGCCTGAGGCCrCrrTGG^ 

CAGGAAGGAGACTGCACAGTTTTGAAAGCACAGCCCGTCAGGTCCGGCTglX^ 

TGTAAGCTATTATAATTAAAATGGTTTTCCGGGAAGGGATGAGTGTGATGTCCTTGAGAGGAAATGAATGCCC 
TGGCCTGGGACTCTACACACAGGCAGGATCCTGAGGTCTCTGGGAACTGCATCAGAAAGTTGACTTGTCAGTC 
CATCTGTGGTAGAATGAGGCTGTGACTGAGCACTGGGACCTTTCTACCAGATGTGGC 



The disclosed NOV13a nucleic acid sequence maps to chromosome 20 and has 1894 of
1900 bases (99%) identical to a gb:GENBANK-ID:HSU67322|acc:U67322.1 mRNA from
Homo sapiens (Human HBV associated factor (XAP4) mRNA, complete cds) (E = 0.0).
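
The percent identity quoted above is simply the ratio of matching to aligned positions. A minimal Python sketch, with a hypothetical helper name, shows the arithmetic; it assumes the reported whole-number percentage is the ratio rounded down, which is an inference rather than something the text states.

```python
import math


def percent_identity(identical: int, aligned: int) -> float:
    """Percent of aligned positions that are identical."""
    if aligned <= 0:
        raise ValueError("aligned length must be positive")
    return 100.0 * identical / aligned


# NOV13a nucleic acid versus GenBank U67322.1: 1894 of 1900 bases.
pct = percent_identity(1894, 1900)
print(f"{pct:.2f}")      # 99.68
print(math.floor(pct))   # 99, the whole-number value quoted in the text
```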

A disclosed NOV13a polypeptide (SEQ ID NO:34) is 510 amino acid residues in 
length and is presented using the one-letter amino acid code in Table 13B. The SignalP, 
Psort and/or Hydropathy results predict that NOV13a does not have a signal peptide and is 
likely to be localized to the cytoplasm with a certainty of 0.4500. In alternative 
embodiments, a NOV13a polypeptide is located to the microbody (peroxisome) with a
certainty of 0.3000, the mitochondrial matrix space with a certainty of 0.1000, or in the
lysosome (lumen) with a certainty of 0.1000.



Table 13B. Encoded NOV13a Protein Sequence (SEQ ID NO:34)

MDEKTiEciiEMALSLTRA^ 

RPDMTVASLKDWFLDYGFPPVI,<^W2GQRIA3?DQETLHSHGVRQNGDSAYLYL 

MLBDLGFKDLTLQPRGPLEPGPPKPGVPQEPGRGQPDAVPEPPPVGWQCPGCTFINKPTRPGCEMCCRARPEAYQ 
VPASYQPDEEERARIJlGEEEALRQyQQRKQQQQEGNYLQHVQLDQRSLVLNTEPAECPVCySVLAPGEAVVLREC 
LHTFCRECLQGTIRNSQEAEVSCPFIDNTYSCSGKLLEREI KAIiJuTPEDYQRFLDLGI S lAENRSAFS YHCKTPD 
CKGWCFFEDDVNEFTCPVCFHVNCLLCKAIHEQMWCKEYQEDIJiLRAQl^VAARQTTEMLKVMLQ^ 
QIWQKKTCrowiRCTVCHTEICWVTKGPRWGPGGPGDTSGGCRCRVNGIPCHPSCQNCH 



The NOV13a amino acid sequence was found to have 457 of 464 amino acid residues 
(98%) identical to, and 459 of 464 amino acid residues (98%) similar to, the 468 amino acid 
residue ptnr:SPTREMBL-ACC:095623 protein from Homo sapiens (Human) (HBV 
ASSOCIATED FACTOR) (E = 9.4e-^^^). 

NOV13a is expressed in at least the liver. Expression information was derived from
the tissue sources of the sequences that were included in the derivation of the sequence of
NOV13a.

Possible small nucleotide polymorphisms (SNPs) found for NOV13a are listed in
Tables 13C and 13D. 



Table 13C: SNPs

Consensus Position    Depth    Base Change    PAF
1000                  9        T>G            0.444


Table 13D: SNPs

Variant     Nucleotide Position    Base Change    Amino Acid Position    Amino Acid Change
13376998    1249                   A>G            365                    Ser>Gly
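
The nucleotide and amino acid positions listed for variant 13376998 can be cross-checked against the NOV13a open reading frame, which the text places at nucleotide 157. The Python sketch below uses hypothetical names. It does not read the actual codon from Table 13A; it only shows that nucleotide 1249 is the first base of codon 365, and that a first-base A>G change in either AGC or AGT (the serine codons that begin with A) gives a glycine codon in the standard genetic code.

```python
def locate_in_orf(nt_pos: int, orf_start: int) -> tuple:
    """Map a 1-based nucleotide position to (residue number, codon position),
    both 1-based, for an ORF whose first base is at orf_start."""
    offset = nt_pos - orf_start
    if offset < 0:
        raise ValueError("position lies upstream of the ORF")
    return offset // 3 + 1, offset % 3 + 1


# Standard genetic code entries needed for this check only.
CODON_TABLE = {"AGC": "Ser", "AGT": "Ser", "GGC": "Gly", "GGT": "Gly"}

residue, codon_pos = locate_in_orf(1249, 157)
print(residue, codon_pos)  # 365 1

# A first-position A>G change in either A-initial serine codon.
for ref in ("AGC", "AGT"):
    alt = "G" + ref[1:]
    print(ref, CODON_TABLE[ref], "->", alt, CODON_TABLE[alt])
```

Both outcomes are consistent with the Ser>Gly change at amino acid position 365 reported in Table 13D.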



NOV13b 

A disclosed NOV13b (designated CuraGen Acc. No. CG56961-02) includes
the 2372 nucleotide sequence (SEQ ID NO:35) shown in Table 13E. An open reading frame
for the mature protein was identified beginning with an ATG codon at nucleotides 1-3 and
ending with a TGA codon at nucleotides 1666-1668. The start and stop codons of the open
reading frame are highlighted in bold type. Putative untranslated regions are underlined.



Table 13E. NOV13b Nucleotide Sequence (SEQ ID NO:35) 

ATGGGCTCGGGGCGCGTCGGAGGCCACACGGCCTGGCTGAGTTGCTCCTGGTCTCCCGCCTCTCCCAGGC 

CGGAGGTAGCATTTCCCAGGAGGCACGGTCCCCCCCAGGGGGATGGGCACAGCCACGCCAGA^ 

CCAAGAAAGCAGAGGAAATGGCCCTGAGCCTCACCCGAGCAGTGGCGGGCGGGGATGAACAGGTG^ 

TGTGCCATCTGGCTGGCAGAGCAACGGGTGCCCCCGAGTGTGCAACTGAAGCCTGAGGTCTCCCXZAACGCyW^ 

CATCAGGCTGTGGGTGAGCGTGGAGGATGCTCAGATGCACACaSTCACCATCTGGCTCA 

TGACCGTGGCGTCTCTCAAGGACATGGTTTTTCTGGACTATGGCTTCCCACCAGTCTTGCAGC^^ 

GGGCAGCGGCTGGCACGAGACCAGGAGACCCTGCACTCCCATGGGGTGCGGCTWSAATGGGGAa^^ 

CTATCTGCTGTCAGCCCGCAACACCTCCCTCAACCCTCAGGAGCTGGAGCGGGAGCGGCAGCTGCGGATGCTGG 

AAGATCTGGGCTTCAAGGACCTCACGCTGCAGCCGCGGGGCCCTCTGGAGCCAGGCCCCCCAAAGCCCGGGGTC 

CCCCAGGAACCCGGACGGGGGCAGCCAGATGCAGTGCCTGAGCCCCCACCGGTGGGCTGGCAGTGCCCCGGGTG 

CACCTTCATCAACAAGCCCACGCGGCCTGGCTGTGAGATGTGCTGCCG6GCGCGCCCCX3AGGCCTACCAGGTCC 

CCGCCTCATACCAGCCCGACGAGGAGGAGCGAGCGCGCCTGGC6GGCGAGGAGGAGGCGCTGCGTCAGTACCAG 

CAGCGGAAGCAGCAGCAGCAGGAGGGGAACTACCTGCAGmCGTCCAGCTGGACCAGAGGAGCCTGCTGCT^ 

CACGGAGCCCGCCGAGTGCCCCGTGTGCTACTCGGTGCTGGCGCCCGGCGAGGCCGTGGTGCTGCXSTGAGTC^ 

TGCACACCTTCTGCAGGGAGTGCCTGCAGGGCACCATCCGCAACAGCCa^GGAGGaSG^ 

ATTGACAACACCTACTCGTGCTCGGGCAAGCTGCTGGAGAGGGAGATCAAGGCGCTCCTGACCCCTGAGGATTA 

CCAGCGATTTCTAGACCTGGGCATCTCCATTGCTGAAAACCGCAGTGCCTTCAGCTACCATTGCaAGACCCm 

ATTGCAAGGGATGGTGCTTCTTTGAGGATGATGTCAATGAGTTCACCTGCCCIXSTGTGTTTCCaCGTC^ 

CTGCTCTGCAAGGCCATCCATGAGCAGATGAACTGCAAGGAGTATCAGGAQGACCTGGCCCTGCGGGCTCAGA^ 

CGATGTGGCTGCCCGGCAGACGACAGAGATGCTGAAGGTGATGCTGCAGCAGGGCGAG6CCATGCGCTGCCCCC 

AGTGCCAGATCGTGGTACAGAAGAAGGACGGCTGCGACTGGATCCGCTGCACCGTCTGCCACACCGAGATCTGC 






TGGGTCACCAAGGGCCCACGCTGGGGCCCTGGGGGCCCMGAGACACCaGCGGGGGCTGCC^ 

TririaATTrrTTnrnArrrAAGCTGTCAGAACTGCCACTG AGCTAAAGATGGTGGG^ 

CCa>CATCCACATTCTGTTAGAj^TGTAGCTCAGGGAGCTTCGTGGACGGCCmX3CTTGCTGTA 

TCCrGCCTGCACTGCGGTTGTCCACGGTCACATCTGCCCCAGTGCCTTTGTCCTTCC(^ 

AGACTTCTCTCCCCTGCGGCTCCCACCTCTGCCTGACCCCAGCCrrAAACa^TAGCCCCT 

TGGGTGGAGCCrCTGTGTGACTCCATACTCCTCCCACCACAACACTCATCTGTCAAACACCAAGCACT 

CTCCCCGCCTTCAGCTGTCAGCTTTCTGGGGCTAACTTCTCTGCCTTTGTGGTTGGAGGCCTGAGGCCTC^ 

AACTCTTGCTAACCTGTTCAGAGCCAGGAAGGAGACTGCACAGTTTTGAAAGCACaG<X:CGTCAGCT 

TGCGTCTCCCTCTCTGCAACCTGTGTAAGCTATTATAATTAAAATGGTTTTCCQGGAAGGGATGAGTGTGATGT 

CCTTGAGAGGAAATGAATGCCCTGGCCTGGGACTCTACACACAGGCAGGATCCTGAGGTCTCTGGGAACTGC^ 

CAGAAAGTTGACTTGTCAGTCCATCTGTGGTAGAATGAGGCrGTGACTGAGCACri^^ 

TGGC - 



The disclosed NOV13b nucleic acid sequence maps to chromosome 20 and has 1949
of 1993 bases (97%) identical to a gb:GENBANK-ID:HSU67322|acc:U67322.1 mRNA from
Homo sapiens (Human HBV associated factor (XAP4) mRNA, complete cds) (E = 0.0).

A disclosed NOV13b polypeptide (SEQ ID NO:36) is 555 amino acid residues in 
length and is presented using the one-letter amino acid code in Table 13F. The SignalP, Psort 
and/or Hydropathy results predict that NOV13b does not have a signal peptide and is likely to
be localized to the cytoplasm with a certainty of 0.4500. In alternative embodiments, a
NOV13b polypeptide is located to the microbody (peroxisome) with a certainty of 0.3000,
the mitochondrial matrix space with a certainty of 0.1000, or the lysosome (lumen) with a
certainty of 0.1000.



Table 13F. Encoded NOV13b Protein Sequence (SEQ ID NO:36) 

MGSGRVGGHTAWLSCSWSPASPRRPGGSISQEARSPPGGWAQPRQMDEKTKKAEEMALSLTRAVAGGDEQVAMKC 

AIWIAEQRVPPSVQLKPEVSPTQDIRLWSVEDAQMHTVTIWLTVRPDMTVASLKDMVFLDYGFPP^ 

RIJUilX^ETLHSHGVRQNGDSAyLYLLSARNTSI^PQELQRERQLRMLEDLGFKDLTLQPRGPLEPGPPKPGVPQE 

PGRGQPDAVPEPPPVGWQCPGCTFINKPTRPGCEMCCRARPEAYQVPASYQPDEEERARIiAGEEEALRQYQQRKQ 

QQQEGNYLQHVQLDQRSLVI^EPAECPVCYSVLAPGEAVVLRECLHTFCRECLQGTIRNSQEAEVSCPFID 

SCSGKLLEREIKALLTPEDYQRFLDLGISIAENRSAFSYHCKTPDCKGWCFFEDDVNEFTCPVCFHVNCLIXin^ 

HEQI^CKEYQEDIALRAQNDVAARQTTEMLKVMLQQGEAMRCPQCQIVVQKKTC 

WGPGQPGDTSGGCRCRVNGIPCHPSCQNCH 



The NOV13b amino acid sequence was found to have 499 of 500 amino acid residues
(99%) identical to, and 499 of 500 amino acid residues (99%) similar to, the 500 amino acid 
residue ptnr:TREMBLNEW-ACC:CAC28312 protein from Homo sapiens (Human) 
(DJ852M4.L2 (HBV ASSOCIATED FACTOR (ISOFORM 2))) (E - 1 .3e'^^^). 

NOV13b is expressed in at least the following tissues: adrenal gland, bone marrow,
brain - amygdala, brain - cerebellum, brain - hippocampus, brain - substantia nigra, brain -
thalamus, brain - whole, fetal brain, fetal kidney, fetal liver, fetal lung, heart, kidney,
lymphoma - Raji, mammary gland, pancreas, pituitary gland, placenta, prostate, salivary
gland, skeletal muscle, small intestine, spinal cord, spleen, stomach, testis, thyroid, trachea
and uterus. Expression information was derived from the tissue sources of the sequences that
were included in the derivation of the sequence of NOV13b.

NOV13a and NOV13b are very closely homologous, as is shown in the amino acid
alignment in Table 13G.



Table 13G. Amino Acid Alignment of NOV13a and NOV13b 



[Pairwise alignment of NOV13a (510 residues) and NOV13b (555 residues); the residue-level rows are not legible in the source.]



Homologies to any of the above NOV13 proteins will be shared by the other NOV13 
proteins insofar as they are homologous to each other as shown above. Any reference to 
NOV13 is assumed to refer to both of the NOV13 proteins in general, unless otherwise noted.

NOV13a also has homology to the amino acid sequences shown in the BLASTP data
listed in Table 13H. 



Table 13H. BLAST results for NOV13a 


Gene Index/Identifier         Protein/Organism                  Length   Identity    Positives   Expect
                                                                (aa)     (%)         (%)

gi|15929590|gb|AAH15219.1|    HBV associated factor             510      510/510     510/510     0.0
AAH15219 (BC015219)           [Homo sapiens]                             (100%)      (100%)

gi|14043036|ref|NP_112506.1|  chromosome 20 open reading        500      500/500     500/500     0.0
(NM_031229)                   frame 18, isoform 2; HBV                   (100%)      (100%)
                              associated factor [Homo sapiens]

gi|5454168|ref|NP_006453.1|   chromosome 20 open reading        468      455/455     455/455     0.0
(NM_006462)                   frame 18, isoform 1; HBV                   (100%)      (100%)
                              associated factor [Homo sapiens]

gi|9790279|ref|NP_062679.1|   ubiquitin conjugating enzyme 7    498      455/500     472/500     0.0
(NM_019705)                   interacting protein 3                      (91%)       (94%)
                              [Mus musculus]

gi|11120718|ref|NP_068532.1|  protein kinase C-binding          498      453/500     474/500     0.0
(NM_021764)                   protein Beta15                             (90%)       (94%)
                              [Rattus norvegicus]



The homology of these sequences is shown graphically in the ClustalW analysis
shown in Table 13I.

Table 13I. ClustalW Analysis of NOV13

1) NOV13a (SEQ ID NO:34)
2) NOV13b (SEQ ID NO:36)
3) gi|15929590| (SEQ ID NO:224)
4) gi|14043036| (SEQ ID NO:225)
5) gi|5454168| (SEQ ID NO:226)
6) gi|9790279| (SEQ ID NO:227)
7) gi|11120718| (SEQ ID NO:228)

[ClustalW alignment of NOV13a and NOV13b with gi|15929590|, gi|14043036|, gi|5454168|, gi|9790279| and gi|11120718|; the residue-level rows are not legible in the source.]




Tables 13J and 13K list the domain descriptions from DOMAIN analysis results against
NOV13. This indicates that the NOV13 sequence has properties similar to those of other
proteins known to contain these domains, including the gnl|Load|LOAD_little_fing,
little_fing, Zinc coordinating RNA binding domain.

Table 13J. Domain Analysis of NOV13

HMM file: pfamHMMs

Scores for sequence family classification (score includes all domains):

Model      Description                               Score    E-value
zf-RanBP   Zn-finger in Ran bind prot & others        24.3    0.0028
zf-C3HC4   Zinc finger, C3HC4 type (RING finger)      22.3    1.5e-05
IBR        IBR domain                                -19.1    8.3

Parsed for domains:

Model      Domain   seq from   seq to      hmm from   hmm to      score    E-value
zf-RanBP   1/1      194        222   ..    1          32    []     24.3    0.0028
zf-C3HC4   1/2      282        325   ..    1          53    [.     26.7    6.3e-07
zf-C3HC4   2/2      387        394   ..    46         54    .]      0.7    63
IBR        1/1      351        411   ..    1          72    []    -19.1    8.3

Alignments of top-scoring domains:

zf-RanBP: domain 1 of 1, from 194 to 222: score 24.3, E = 0.0028

            *->ragsdWdCissClvqNfatstkCvaCqapkps<-*  (SEQ ID NO:229)
NOV13  194     PVG--WQC-PGCTFINKPTRPGCEMCCRARPE     222  (SEQ ID NO:230)

zf-C3HC4: domain 1 of 2, from 282 to 325: score 26.7, E = 6.3e-07

            *->CpICltTFdldepkpfkepvllpCgHsFCskCivellrlsqnsknnsvykCPl<-*  (SEQ ID NO:231)
NOV13  282     CPVC-----YSVLAPGEAVVLRECLHTFCRECLQGTIRNSQEAE---VS-CPF     325  (SEQ ID NO:232)

zf-C3HC4: domain 2 of 2, from 387 to 394: score 0.7, E = 63

            *->nsvykCPlC<-*  (SEQ ID NO:233)
NOV13  387     NEFT-CPVC     394  (SEQ ID NO:234)

IBR: domain 1 of 1, from 351 to 411: score -19.1, E = 8.3

               eKYekfmvrsyveknpdlkwCPgpdCsyavrltevssstelaepprVeCkkPaCgtsFCfkCgaeWHapvsC  (SEQ ID NO:235)
NOV13  351     QRFLDLGISIAENRSAFSYHCKTPDCKGWCFFED DVNEF TCPV--CFHVNCLLCKAI-HEQMNC     411  (SEQ ID NO:236)



Table 13K Domain Analysis of NOV13 

gnl|Smart|smart00213, UBQ, Ubiquitin homologues; Ubiquitin-mediated
proteolysis is involved in the regulated turnover of proteins required for
controlling cell cycle progression
CD-Length = 72 residues, 83.3% aligned
Score = 36.2 bits (82), Expect = 0.005



NOV13: 70 TIWLTVRPDMTVASLKDMVFLDYGPPPVI,QQWI---GQRLARDQETLHSHGVRQNGDSAY 127 

II I hi 11+ 11+ + I II il +1 i I II H+ l + l + + 

Sbjct: 12 TITLEVKPSDTVSELKEKIADLEGIPPE-QQRLIYKGKVL-EDDRTLAEYGI-QDGSTIH 68 

NOV13: 128 LYL 130 (SEQ ID NO: 237) 



Sbjct: 69 LVIi 71 (SEQ ID NO:238) 



Ran binding proteins (RanBPs) are putative nuclear-export terminators and importin-
beta-like molecules; they are known to bind RanGTP and RanGDP. The RanBP zinc finger
found mainly in these proteins binds exclusively RanGDP (Blobel G., Yaseen N.R., 1999,
Proc. Natl. Acad. Sci. U.S.A. 96: 5516-5521).

The RING-finger is a specialized type of Zn-finger of 40 to 60 residues that binds two
atoms of zinc, and is probably involved in mediating protein-protein interactions. There are
two different variants, the C3HC4-type and a C3H2C3-type, which are clearly related despite
the different cysteine/histidine pattern. The latter type is sometimes referred to as 'RING-H2
finger'.

E3 ubiquitin-protein ligase activity is intrinsic to the RING domain of c-Cbl and is
likely to be a general function of this domain. Various RING fingers exhibit binding to E2
ubiquitin-conjugating enzymes (Ubc's). Several 3D-structures for RING-fingers are known
[2, 3]. The 3D structure of the zinc ligation system is unique to the RING domain and is
referred to as the 'cross-brace' motif. The spacing of the cysteines in such a domain is C-x(2)-
C-x(9 to 39)-C-x(1 to 3)-H-x(2 to 3)-C-x(2)-C-x(4 to 48)-C-x(2)-C. The way the 'cross-
brace' motif binds two atoms of zinc is illustrated in the following schematic
representation:

                  x x x     x x x
                 x      x x      x
                x       x x       x
               x        x x        x
              C        C   C        C
             x  \    / x   x \    /  x
             x    Zn   x   x   Zn    x
              C /    \ C    H /    \ C
              x         x x         x
    x x x x x x          x          x x x x x x

'C': conserved cysteine involved in zinc binding.
'H': conserved histidine involved in zinc binding.
'Zn': zinc atom.
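
The cysteine/histidine spacing quoted above, C-x(2)-C-x(9 to 39)-C-x(1 to 3)-H-x(2 to 3)-C-x(2)-C-x(4 to 48)-C-x(2)-C, translates directly into a regular expression. The Python sketch below is illustrative only: the names are hypothetical, the test sequence is synthetic (made up to satisfy the spacing, not a real protein), and a pattern hit by itself does not establish that a sequence folds as a RING finger.

```python
import re

# C3HC4 spacing from the text; "x" is taken to mean any residue letter.
RING_C3HC4 = re.compile(
    r"C[A-Z]{2}C"      # C-x(2)-C
    r"[A-Z]{9,39}C"    # x(9 to 39)-C
    r"[A-Z]{1,3}H"     # x(1 to 3)-H
    r"[A-Z]{2,3}C"     # x(2 to 3)-C
    r"[A-Z]{2}C"       # x(2)-C
    r"[A-Z]{4,48}C"    # x(4 to 48)-C
    r"[A-Z]{2}C"       # x(2)-C
)


def find_ring_fingers(sequence: str) -> list:
    """Return 1-based inclusive (start, end) spans of non-overlapping matches."""
    seq = sequence.upper()
    return [(m.start() + 1, m.end()) for m in RING_C3HC4.finditer(seq)]


# Synthetic sequence built to fit the spacing (assumption, not real data).
synthetic = (
    "MK" + "CAAC" + "A" * 10 + "C" + "A" + "H" + "AA" + "C"
    + "AA" + "C" + "A" * 6 + "C" + "AA" + "C" + "KK"
)
print(find_ring_fingers(synthetic))  # [(3, 35)]
```

Profile-based searches such as the HMM results in Table 13J tolerate insertions and substitutions that a fixed pattern of this kind rejects, so the two approaches can legitimately disagree on the same sequence.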

Note that in the older literature, some RING-fingers are denoted as LIM-domains. The
LIM-domain Zn-finger is a fundamentally different family, albeit with similar Cys-spacing
(see INTERPRO IPR001781; Freemont, 1993, Ann. N.Y. Acad. Sci. 684: 174-192; Freemont
and Borden, 1996, Curr. Opin. Struct. Biol. 6: 395-401; Freemont et al., 1996, Trends
Biochem. Sci. 21: 208-214; Freemont, 2000, Curr. Biol. 10(2); Hunter et al.,
1999, Science 286: 309-312; Barinaga, 1999, Science 286(5438): 223).

Primary cancer of the liver in three brothers was described by Kaplan and Cole (1965) 
and by Hagstrom and Baker (1968). In these patients there was no recognized preexisting 
liver disease. Denison et al. (1971) described two adult brothers who died of primary 
hepatocellular carcinoma. Both had micronodular cirrhosis with features of subacute 
progressive viral hepatitis. Australia antigen was demonstrated in the brother in whom it was 
sought. Their father had died much earlier of hepatocellular carcinoma. Familial LCC might 
also have its explanation in alpha-1-antitrypsin deficiency, hemochromatosis, and 
tyrosinemia. Integration of the hepatitis B virus (HBV) into cellular DNA occurs during 
long-term persistent infection in man. Hepatocellular carcinomas isolated from carriers of 
virus often contain clonally propagated viral DNA. Shen et al. (1991) presented evidence for 
the interaction of inherited susceptibility and hepatitis B viral infection in cases of primary 
hepatocellular carcinoma in eastern China. Complex segregation analysis of 490 extended 
families supported the existence of a recessive allele with population frequency 
approximately 0.25, which results in a lifetime risk of HCC in the presence of both HBV 
infection and genetic susceptibility, of 0.84 for males and 0.46 for females. The model 
further predicted that, in the absence of genetic susceptibility, lifetime risk of HCC is 0.09 for 
HBV-infected males and 0.01 for HBV-infected females and that regardless of genotype the 
risk is virtually zero for uninfected persons. 
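
As a rough worked example of the figures quoted from Shen et al. (1991), the sketch below combines the allele frequency with the genotype-specific risks. It assumes Hardy-Weinberg proportions, which is an assumption introduced here for illustration and is not stated in the source; under that assumption a recessive allele at frequency 0.25 gives a susceptible (homozygous) fraction of 0.25 x 0.25 = 0.0625. The function names are hypothetical.

```python
def susceptible_fraction(allele_freq):
    """Fraction of recessive homozygotes, ASSUMING Hardy-Weinberg proportions."""
    return allele_freq * allele_freq


def average_lifetime_risk(allele_freq, risk_susceptible, risk_nonsusceptible):
    """Genotype-weighted lifetime HCC risk among HBV-infected persons."""
    p = susceptible_fraction(allele_freq)
    return p * risk_susceptible + (1.0 - p) * risk_nonsusceptible


# Values quoted in the text for HBV-infected persons.
Q = 0.25
male_avg = average_lifetime_risk(Q, 0.84, 0.09)
female_avg = average_lifetime_risk(Q, 0.46, 0.01)

if __name__ == "__main__":
    print(f"susceptible fraction: {susceptible_fraction(Q):.4f}")
    print(f"average risk, infected males:   {male_avg:.4f}")
    print(f"average risk, infected females: {female_avg:.4f}")
```

The averages are illustrative only; the source reports the genotype-specific risks, not a population average.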

The finding of small deletions in retinoblastoma and Wilms tumor prompted Rogler et 
al. (1985) to look for the same in association with HBV integration in hepatocellular 
carcinoma. They demonstrated a deletion of at least 13.5 kb of cellular sequences in a liver 
cancer. The HBV integration and the deletion occurred on the short arm of chromosome 11 at 
location 11p14-p13. The deleted sequences were lost in tumor cells leaving only a single 
copy. Clones of the DNA flanking the deleted segment were used for the mapping of the 
deletion in somatic cell hybrids and by in situ hybridization. Cellular sequences homologous 
to the deleted region were cloned and used to exclude the possibility that this DNA had been 
moved to other positions in the genome. Fisher et al. (1987) extended the observations of 
Rogler et al. (1985). Using somatic cell hybrids that contained defined 11p deletions, 2 
cloned DNA sequences that flank the deletion generated by a hepatocellular carcinoma (as a 
consequence of hepatitis B virus integration) were mapped to 11p13. Wilms tumor and the 
tumors of Beckwith-Wiedemann syndrome are also determined by changes on 11p. 

Henderson et al. (1988) found that unique cellular DNA to the left of an HBV DNA 
integration site cloned from a primary tumor mapped to chromosome 18q (18q11.1-q11.2), 
whereas right-hand flanking DNA mapped to chromosome 17 at a subterminal region of the 
long arm. In a hepatoma specimen from Shanghai, Zhou et al. (1988) identified integration 
of hepatitis B virus into 17p12-p11.2, which is near the human protooncogene p53. 
Furthermore, the sequence of flanking cellular DNA showed highly significant homology 
with a conserved region of a number of functional mammalian DNAs, including the human 
autonomously replicated sequence-1 (ARS1). ARS1 is a sequence of human DNA that allows 
replication of Saccharomyces cerevisiae integrative plasmids as autonomously replicating 
elements in S. cerevisiae cells. Since integration of viral DNA is not a required step in the 
replicative cycle of the hepatitis virus, the presence of integrated HBV sequences in many 
human hepatocellular carcinomas suggests a causal relationship. Since any one of several 
integration sites may lead to the same result, the crucial cellular targets involved in triggering 
liver cell malignant transformation may differ from tumor to tumor. Smith et al. (1989) gave 
evidence for microdeletions of chromosome 4q involving the alcohol dehydrogenase 
isoenzyme gene ADH3 in hepatomas from 3 of 5 individuals heterozygous for an XbaI 
RFLP detectable by the ADH probe. Two of 7 individuals heterozygous for an epidermal 
growth factor RFLP had lost 1 EGF allele in their hepatoma tissue. 

Agarwal et al. (1998) reported a case of severe gynecomastia in a seventeen and one- 
half-year-old boy due to high levels of aromatase expression in a large fibrolamellar 
hepatocellular carcinoma, which caused extremely elevated serum levels of estrone (1200 
pg/mL) and estradiol-17 (312 pg/mL) that suppressed follicle-stimulating hormone (FSH) and 
luteinizing hormone (LH) (1.3 and 2.8 IU/L, respectively) and consequently testosterone 
(1.53 ng/mL). After removal of the 1.5-kg tumor, gynecomastia partially regressed, and 
normal hormone levels were restored. By immunohistochemistry, diffuse intracytoplasmic 
aromatase expression was detected in the liver cancer cells. Northern blot analysis showed 
P450 aromatase transcripts in total RNA from the hepatocellular cancer but not in the 
adjacent liver or in disease-free adult liver samples. Promoters I.3 and II were used for P450 
aromatase transcription in the cancer. 

Primary hepatocellular carcinoma occurs at high frequencies in east Asia and sub- 
Saharan Africa. In these areas of the world, chronic infection with the hepatitis B virus is the 
best documented risk factor; however, only 20 to 25% of HBV carriers develop HCC. 
Exposure to the fungal toxin aflatoxin B1 (AFB1) has been suggested to increase HCC risk, 
in part because in vitro experiments demonstrated that AFB1 mutagenic metabolites bind to 
DNA and are capable of inducing G-to-T transversions. In certain areas of the HCC endemic 
regions, a mutational hotspot has been reported in the p53 tumor suppressor gene (TP53): an 
AGG-to-AGT transversion (arginine to serine) of codon 249 in exon 7. Microsomal epoxide 
hydrolase (EPHX) and glutathione-S-transferase M1 (GSTM1) are both involved in AFB1 
detoxification in hepatocytes. Polymorphism of both genes has been identified. In Ghana and 
China, McGlynn et al. (1995) conducted studies to determine whether mutant alleles at one or 
both of these loci are associated with increased levels of serum AFB1-albumin adducts, with 
HCC, and with mutations at codon 249 of p53. In a cross-sectional study, they found that 
mutant alleles at both loci were significantly over-represented in individuals with serum 
AFB1-albumin adducts. Additionally, in a case-control study, mutant alleles of EPHX were 
significantly over-represented in persons with HCC. The relationship of EPHX to HCC 
varied by hepatitis B surface antigen status, indicating that a synergistic effect may exist. 
Mutations at codon 249 of p53 were observed only among HCC patients with one or both 
high-risk genotypes. These findings by McGlynn et al. (1995) supported the existence of 
genetic susceptibility in humans to the environmental carcinogen AFB1 and indicated that 
there is a synergistic increase in risk of HCC with the combination of hepatitis B virus 
infection and susceptible genotype. 

Schwienbacher et al. (2000) analyzed DNA and RNA from 52 human 
hepatocarcinoma samples and found abnormal imprinting of genes located at 11p15 in 51% 
of 37 informative samples. The most frequently detected abnormality was gain of imprinting, 
which led to loss of expression of genes present on the maternal chromosome. As compared 
with matched normal liver tissue, hepatocellular carcinoma showed extinction or significant 
reduction of expression of one of the alleles of the CDKN1C, SLC22A1L, and IGF2 genes. 
Loss of maternal-specific methylation of the KvDMR1 gene in hepatocarcinoma correlated 
with abnormal expression of CDKN1C and IGF2, suggesting a function for KvDMR1 as a 
long-range imprinting center active in adult tissues. These results pointed to the role of 
epigenetic mechanisms leading to loss of expression of imprinted genes at 11p15 in human 
tumors. 

See: Agarwal, et al., J. Clin. Endocr. Metab. 83: 1797-1800, 1998. PubMed ID: 9589695; 
Chang, et al., Cancer 53: 1807-1810, 1984. PubMed ID: 6321015; Denison, et al., Ann. 
Intern. Med. 74: 391-394, 1971. PubMed ID: 4324021; Fisher, et al., Hum. Genet. 75: 66-69, 
1987. PubMed ID: 3026949; Hagstrom and Baker, Cancer 22: 142-150, 1968. PubMed 
ID: 4298178; Henderson, et al., Cancer Genet. Cytogenet. 30: 269-275, 1988. PubMed ID: 
2830013; Kaplan and Cole, Am. J. Med. 39: 305-311, 1965; Lynch, et al., Cancer Genet. 
Cytogenet. 11: 11-18, 1984. PubMed ID: 6317164; McGlynn, et al., Proc. Nat. Acad. Sci. 
92: 2384-2387, 1995. PubMed ID: 7892276; Rogler, et al., Science 230: 319-322, 1985. 
PubMed ID: 2996131; Schwienbacher, et al., Proc. Nat. Acad. Sci. 97: 5445-5449, 2000. 
PubMed ID: 10779553; Shen, et al., Am. J. Hum. Genet. 49: 88-93, 1991. PubMed ID: 
1648308; Smith, et al., (Abstract) Cytogenet. Cell Genet. 51: 1081 only, 1989; and Zhou, et 
al., J. Virol. 62: 4224-4231, 1988. PubMed ID: 2845134. 

The protein similarity information, expression pattern, cellular localization, and map 
location for the NOV13 protein and nucleic acid disclosed herein suggest that this HBV 
Associated Factor-like protein may have important structural and/or physiological functions 
characteristic of the intracellular family. Therefore, the nucleic acids and proteins of the 
invention are useful in potential diagnostic and therapeutic applications and as a research 
tool. These include serving as a specific or selective nucleic acid or protein diagnostic and/or 
prognostic marker, wherein the presence or amount of the nucleic acid or the protein is to be 
assessed. These also include potential therapeutic applications such as the following: (i) a 
protein therapeutic, (ii) a small molecule drug target, (iii) an antibody target (therapeutic, 
diagnostic, drug targeting/cytotoxic antibody), (iv) a nucleic acid useful in gene therapy (gene 
delivery/gene ablation), (v) an agent promoting tissue regeneration in vitro and in vivo, and 
(vi) a biological defense weapon. 

The NOV13 nucleic acids and proteins of the invention have applications in the 
diagnosis and/or treatment of various diseases and disorders. For example, the compositions 
of the present invention will have efficacy for the treatment of patients suffering from: Von 
Hippel-Lindau (VHL) syndrome, cirrhosis, transplantation, cancer, hepatitis B as well as 
other diseases, disorders and conditions. 

The novel nucleic acid encoding the HBV Associated Factor-like protein of the 
invention, or fragments thereof, are useful in diagnostic applications, wherein the presence or 
amount of the nucleic acid or the protein is to be assessed. 

These materials are further useful in the generation of antibodies that bind 
immunospecifically to the novel substances of the invention for use in therapeutic or 
diagnostic methods. These antibodies may be generated according to methods known in the 
art, using prediction from hydrophobicity charts, as described in the "Anti-NOVX 
Antibodies" section below. The disclosed NOV13 protein has multiple hydrophilic regions, 
each of which can be used as an immunogen. In one embodiment, a contemplated NOV13 
epitope is from about amino acids 2 to 3. In another embodiment, a contemplated NOV13 
epitope is from about amino acids 60 to 70. In other specific embodiments, contemplated 
NOV13 epitopes are from about amino acids 90 to 92, 110 to 120, 125 to 130, 180 to 195, 
200 to 300, 310 to 390, 400 to 410 and 420 to 490. 

NOV14 

One NOVX protein of the invention, referred to herein as NOV14, includes two 
Apolipoprotein L-like proteins. The disclosed proteins have been named NOV14a and 
NOV14b. 

NOV14a 

A disclosed NOV14a (designated CuraGen Acc. No. CG57104-01), which encodes a 
novel Apolipoprotein L-like protein and includes the 1233 nucleotide sequence (SEQ ID 
NO:37), is shown in Table 14A. An open reading frame for the mature protein was identified 
beginning with an ATG initiation codon at nucleotides 10-12 and ending with a TGA stop 
codon at nucleotides 1213-1215. Putative untranslated regions are underlined in Table 14A, 
and the start and stop codons are in bold letters. 
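
The open reading frame coordinates above can be checked against the reported polypeptide length with simple arithmetic. The sketch below is illustrative only; the function names are hypothetical, and coordinates are taken to be 1-based and inclusive, as the text implies.

```python
def orf_codon_count(start_first_nt, stop_last_nt):
    """Number of codons from the first base of the start codon through
    the last base of the stop codon (1-based, inclusive coordinates)."""
    length = stop_last_nt - start_first_nt + 1
    if length % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    return length // 3


def encoded_residue_count(start_first_nt, stop_last_nt):
    """Residues in the translated product; the stop codon encodes none."""
    return orf_codon_count(start_first_nt, stop_last_nt) - 1


if __name__ == "__main__":
    # NOV14a: ATG at nucleotides 10-12, TGA at nucleotides 1213-1215.
    print(encoded_residue_count(10, 1215))
```

For NOV14a this gives 402 codons including the stop codon, which agrees with the 401-residue polypeptide described for SEQ ID NO:38.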



Table 14A. NOV14a Nucleotide Sequence (SEQ ID NO:37) 

AGACGTGGGA TGCACACAGCTCAGAACAGTTGGATCTTGCTCAGTCTCTGTCAGAGGAAGATCCCTTGGA 

CAAGAGGACCCTGCCTTGGTGTGAGAGTGAGGGAAGAGGAAGCTGGAACGAGGGTTAAGGAAAACCTTCC 

AGTCTGGACAGTGACTGGAGAGCTCCAAGGAAAGCCCCTCGGTAACCCAGCCGCTGGCACCATGAACCCA 

GAGAGCAGTATCTTTATTGAGGATTACCTTAAGTATTTCCAGGACCAAGTGAGCAGAGAGAATCTGCTAC 

AACTGCTGACTGATGATGAAGCCTGGAATGGATTCGTGGCTGCTGCTGAACTGCCCAGGGATGAGGCAGA 

TGAGCTCCGTAAAGCTCTGAACAAGCTTGCAAGTCACATGGTCATGAAGGACAAAAACCGCCACGATAAA 

GACCAGCAGCACAGGCAGTGGTTTTTGAAAGAGTTTCCrCGGTTGAAAAGGGAGCTTGAGGATCAC^ 

6GAAGCTCCGTGCCCTTGCAGAGGAGGTTGAGCAGGTCCACAGAGGCACCACCATTGCCAATGTGGTGTC 

CAACTCTGTTGGCACTACCTCTGGCATTCTGACCCTCCTCGGCCTGGGTCTGGCACCCTTCACAGAAGGA 

ATCAGTTTTGTGCTCTTGGACACTGGCATGGGTCTGGGAGCAGCAGCTGCTGTGGCTGGGATTACCTGCA 

GTGTGGTAGAACTAGTAAACAAATTGCGGGCACGAGCCCAAGCCCGCAACTTGGACCAAAGCGGCACCAA 

TGTAGCAAAGGTGATGAAGGAGTTTGTGGGTGGGAACACACCCAATGTTCTTACCTTAGTTGACAATTGG 

TACCAAGTCACACAAGGGATTGGGAGGAACATCCGTGCCATCAGACGAGCCAGAGCCAACCCTCAGTTAG 

GAGCGTATGCCCCACCCCCGCATGTCATTGGGCGAATCTCAGCTGAAGGCGGTGAACAGGTTGAGAGGGT 

TGTTGAAGGCCCCGCCCAGGCAATGAGCAGAGGAACCATGATCGTGGGTGCAGCCACTGGAGGCATCTTG 

CTTCTGCTGGATGTGGTCAGCCTTGCATATGAGTCAAAGCACTTGCTTGAGGGGGCSUVAGTCAGAGTCA 

CTGAGGAGCTGAAGAAGCGGGCTCAGGAGCTGGAGGGGAAGCTCAACTTTCTCACCAAGATCCATGAGAT 

GCTGCAGCCAGGCCAAGACCAATGACCCCAGAGCAGTGCAGCC 

The disclosed NOV14a nucleic acid sequence maps to chromosome 22q12 and has 
949 of 1167 bases (81%) identical to a gb:GENBANK-ID:AF019225|acc:AF019225.1 
mRNA from Homo sapiens (Homo sapiens apolipoprotein L mRNA, complete cds) (E = 



A disclosed NOV14a polypeptide (SEQ ID NO:38) is 401 amino acid residues in 
length and is presented using the one-letter amino acid code in Table 14B. The SignalP, 
Psort and/or Hydropathy results predict that NOV14a has a signal peptide and is likely to be 
localized to the endoplasmic reticulum (membrane) with a certainty of 0.6850. In alternative 
embodiments, a NOV14a polypeptide is located to the plasma membrane with a certainty of 
0.6400, the Golgi body with a certainty of 0.4600, or the endoplasmic reticulum (lumen) with 
a certainty of 0.1000. The SignalP predicts a likely cleavage site for a NOV14a peptide 
between amino acid positions 16 and 17, i.e. at the sequence CQR-KI. 
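
The predicted cleavage position can be related to the sequence by simple slicing. The following minimal sketch uses only the first 20 residues of NOV14a as printed in Table 14B; the helper names are hypothetical, and the code illustrates the position arithmetic only, not the SignalP prediction itself.

```python
def cleavage_context(seq, last_signal_residue, upstream=3, downstream=2):
    """Return residues flanking a cleavage site as 'XXX-YY'.

    last_signal_residue is the 1-based position of the final residue
    of the signal peptide; cleavage occurs immediately after it.
    """
    i = last_signal_residue  # 0-based index of the first mature residue
    return seq[i - upstream:i] + "-" + seq[i:i + downstream]


def split_signal_peptide(seq, last_signal_residue):
    """Split a sequence into (signal peptide, remainder)."""
    return seq[:last_signal_residue], seq[last_signal_residue:]


# First 20 residues of NOV14a as printed in Table 14B.
NOV14A_NTERM = "MHTAQNSWILLSLCQRKIPW"

if __name__ == "__main__":
    print(cleavage_context(NOV14A_NTERM, 16))
    print(split_signal_peptide(NOV14A_NTERM, 16))
```

Cleavage after residue 16 of this N-terminal segment reproduces the CQR-KI context stated in the text.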

Table 14B. Encoded NOV14a Protein Sequence (SEQ ID NO:38) 

MHTAQNSWILLSLCQRKIPWXRGPCLGWVREEEAGTRVKEm*PVWTVTGELQGKPLGNPAAGTI^^ 
KYFQDQVSREl^JIJQLLTDDEAmGFVAAAELPRDEADELRKMlNKIlASHM^^^ 
ELEDHIRKIJiAIAEEVEQVHRGTTIANWSNSVGTTSGILT^^ 
VVEI.VNKI.RARAQiU?KITOSGTNVAKVMKEFV^^ 

VIGRISABGGEQVERVVEGPAQJ^SRGTMrVGAATGGIIXLLDWSIJVYESKHLLEGAKSESAEELKKR^ 
NFLTKIHEMLQPGQDQ 

The NOV14a amino acid sequence was found to have 235 of 377 amino acid residues 
(62%) identical to, and 284 of 377 amino acid residues (75%) similar to, the 383 amino acid 
residue ptnr:TREMBLNEW-ACC:AAB81218 protein from Homo sapiens (Human) 
(APOLIPOPROTEIN L-I) (E - 4.6e' ^^). 
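
The percentages quoted for the NOV14a comparisons follow directly from the match counts. The short sketch below, with a hypothetical helper name, shows the arithmetic; rounding to the nearest whole percent is an assumption about how the reported figures were derived.

```python
def percent(matches, aligned_length):
    """Percentage of matching positions, rounded to the nearest integer."""
    if aligned_length <= 0:
        raise ValueError("aligned length must be positive")
    return round(100.0 * matches / aligned_length)


if __name__ == "__main__":
    print(percent(949, 1167))  # nucleotide identity to AF019225
    print(percent(235, 377))   # amino acid identity to AAB81218
    print(percent(284, 377))   # amino acid similarity to AAB81218
```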

NOV14a is expressed in at least the following tissues: adrenal gland, bone marrow, 
brain - amygdala, brain - cerebellum, brain - hippocampus, brain - substantia nigra, brain - 
thalamus, brain - whole, fetal brain, fetal kidney, fetal liver, fetal lung, heart, kidney, 
lymphoma - Raji, mammary gland, pancreas, pituitary gland, placenta, prostate, salivary 
gland, skeletal muscle, small intestine, spinal cord, spleen, stomach, testis, thyroid, trachea 
and uterus. Expression information was derived from the tissue sources of the sequences that 
were included in the derivation of the sequence of NOV14a. The sequence is predicted to be 
expressed in the following tissues because of the expression pattern of 
(gb:GENBANK-ID:AF019225|acc:AF019225.1) a closely related Homo sapiens 
apolipoprotein L mRNA, complete cds homolog in species Homo sapiens: pancreas. 
Possible small nucleotide polymorphisms (SNPs) found for NOV14a are listed in Table 14C. 



Table 14C: SNPs 

Variant     Nucleotide Position   Base Change   Amino Acid Position   Amino Acid Change 
13376999    746                   C>T           246                   Arg>Cys 
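
The amino acid position listed for a variant can be related to its nucleotide position using the open reading frame start. The sketch below is a minimal illustration with a hypothetical function name; it assumes 1-based nucleotide numbering and the NOV14a start codon at nucleotides 10-12 given earlier in the text.

```python
def codon_index(nucleotide_pos, orf_start):
    """Return the 1-based codon (amino acid) number containing a nucleotide.

    Both positions are 1-based; orf_start is the first base of the ATG.
    """
    if nucleotide_pos < orf_start:
        raise ValueError("position lies upstream of the open reading frame")
    return (nucleotide_pos - orf_start) // 3 + 1


if __name__ == "__main__":
    # Variant 13376999: nucleotide position 746, ORF start at nucleotide 10.
    print(codon_index(746, 10))
```

With these inputs the nucleotide at position 746 falls in codon 246, matching the amino acid position in Table 14C.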



NOV14b 

A disclosed NOV14b (designated CuraGen Acc. No. CG57104-02), which includes 
the 1232 nucleotide sequence (SEQ ID NO:39), is shown in Table 14D. An open reading frame 
for the mature protein was identified beginning with an ATG codon at nucleotides 9-11 and 
ending with a TGA codon at nucleotides 1212-1214. The start and stop codons of the open 
reading frame are highlighted in bold type. Putative untranslated regions are underlined. 



Table 14D. NOV14b Nucleotide Sequence (SEQ ID NO:39) 

GACGTGGGA TGCACATAGCTCAGAACAGTTGGATCTTGCrC^GTCrCTCTCa^QAGG?^Q^ 

GACCCTGCCTTGGTGTGAGAGTGAGGGAAGAGGAAGCTGGAAC6A6GGTTAAGGAAAACCTTCCAGTCTGGACAG 

TGACTGGAGAGCTCCT^GGAAAGCCCCTCGGTAACCCAGCCGCTGGCACCATGAACCCAGAGAGO^ 

TTGAGGATTACCTTAAGTATTTCCAGGACCAAGTGAGCAGAGAGAATCT6CTAC3^CTGCTGACTGATGATGAAG 

CCTGGAATGGATTCGTGGCTGCTGCTGAACTGCCCAGGGATGAGGCAGATGAGCTCCGTAAAGCTCTGAACAAG^ 

TTGCAAGTCACATGGTCATGAAGGACAAAAACCGCCACGATAAAGACCAGCAGCACAGGC^ 

AGTTTCCTCGGTTGAAAAGGGAGCTTGAGGATCAGATAAGGAftGCTCCGTGCCCTTGCSVGAGGAGGTT 

TCCACAGAGGCACCACCATTGCCAATGTGGTGTCCaUVCTCrcTTGGCACTACCT 

GCCTGGGTCTGGCACCCTTCACAGAAGGAATCAGTTTTGTGCTCTTGQACACTG6 

CTGCTGTGGCTGGGATTACCTGCAGTGTGGTAGAACTAGTAAACAAATTGCGGGCACGA6CCCAA6CCCGCAACT 

TGGACCAAAGCGGCACCAATGTAGCAAAGGTGATGAAGGA6TTTGTGGGTGGGAACACACCCAATGTTCTTACCT 

TAGTTGACAATTGGTACCAAGTCACACAAGGGATTGGGAGGAACATCCGTGCCATCAGACGAGCC^ 

CTCAGTTAGGAGCGTATGCCCCACCCCCGCATGTCATTGGGCGAATCTCAGCTGAAGGCGGTGAACAGGTTGAGA 

GGGTTGTTGAAGGCCCCGCCCAGGCAATGAGCAGAGGAACCATGATCGTGGGTGCAGCCACTGGAGGCATCT^ 

TTCTGCTGGATGTGGTCAGCCTTGCATATGAGTCAAAGCa^CTTGCTTGAGGGGGCAAAGTCAGAGTCA 

AGCTGAAGAAGCGGGCTCAGGAGCTGGAGGGGAAGCTCAACTTTCTCACCAAGATCCa^TGA^ 

GCCAAGACCAATGACCCCAGAGCAGTGCAGCC 

The disclosed NOV14b nucleic acid sequence maps to chromosome 22q12 and has 
975 of 1200 bases (81%) identical to a gb:GENBANK-ID:AF019225|acc:AF019225.2 
mRNA from Homo sapiens (Homo sapiens apolipoprotein L-I mRNA, complete cds) (E = 
3,6e-^^^). 

A disclosed NOV14b polypeptide (SEQ ID NO:40) is 401 amino acid residues in 
length and is presented using the one-letter amino acid code in Table 14E. The SignalP, Psort 
and/or Hydropathy results predict that NOV14b has a signal peptide and is likely to be 
localized to the endoplasmic reticulum (membrane) with a certainty of 0.6850. In alternative 
embodiments, a NOV14b polypeptide is located to the plasma membrane with a certainty of 
0.6400, the Golgi body with a certainty of 0.4600, or the endoplasmic reticulum (lumen) with 
a certainty of 0.1000. The SignalP predicts a likely cleavage site for a NOV14b peptide 
between amino acid positions 14 and 15, i.e. at the sequence SLC-QR. 



Table 14E. Encoded NOV14b Protein Sequence (SEQ ID NO:40) 

MHIAQNSWILLSLCQRKIPWTRGPCLGVRVREEEAGTRVKENLPVWTVTGELQGKPLGNPAAGTMNPESSI 
IJCYFQDQVSRENLLQLLTDDEAWNGFVAAAELPRDEADELRKAIJSrKIASHMVMK^ 
KREI^DHIRKLRAIJ^EVEQVHRGTTIAOTVSNSVGTTSGILTLIX?LGLAPFTEGISFVI^^ 
TCSVVELWKLRARAQARNIJ)QSGTNVAKVMKEFVGGlsrrPNVI/TLVDN^ 

PPPHVIGRISAEGGEQVERVVEGPAQAMSRGTMIVGAATGGILIJL.IJ3VVSLAYESKHLIiEGAKSE 
UBGKLNFLTKIHEMLQPGQDQ 

The NOV14b amino acid sequence was found to have 336 of 337 amino acid residues 
(99%) identical to, and 337 of 337 amino acid residues (100%) similar to, the 337 amino acid 
residue ptnr:SWISSNEW-ACC:Q9BQE5 protein from Homo sapiens (Human) 
(Apolipoprotein L2 (Apolipoprotein L-II) (ApoL-II)) (E = L3e'^^). 

NOV14b is expressed in at least the following tissues: adrenal gland, bone marrow, 
brain - amygdala, brain - cerebellum, brain - hippocampus, brain - substantia nigra, brain - 
thalamus, brain - whole, fetal brain, fetal kidney, fetal liver, fetal lung, heart, kidney, 
lymphoma - Raji, mammary gland, pancreas, pituitary gland, placenta, prostate, salivary 
gland, skeletal muscle, small intestine, spinal cord, spleen, stomach, testis, thyroid, trachea 
and uterus. Expression information was derived from the tissue sources of the sequences that 
were included in the derivation of the sequence of NOV14b. The sequence is predicted to be 
expressed in the following tissues because of the expression pattern of 
(gb:GENBANK-ID:AF019225|acc:AF019225.2) a closely related Homo sapiens 
apolipoprotein L-I mRNA, complete cds homolog in species Homo sapiens: pancreas. 

NOV14a and NOV14b are very closely homologous as is shown in the amino acid 
alignment in Table 14F. 



Table 14F. Amino Acid Alignment of NOV14a and NOV14b 

[Alignment figure: NOV14a and NOV14b, residues 1 to 401] 

Homologies to any of the above NOV14 proteins will be shared by the other NOV14 
proteins insofar as they are homologous to each other as shown above. Any reference to 
NOV14 is assumed to refer to both of the NOV14 proteins in general, unless otherwise noted. 

NOV14a also has homology to the amino acid sequences shown in the BLASTP data 
listed in Table 14G. 



Table 14G. BLAST results for NOV14a 

Gene Index/Identifier                              Protein/Organism                                 Length (aa)  Identity (%)    Positives (%)   Expect 
gi|13325156|gb|AAH04395.1|AAH04395 (BC004395)      Similar to apolipoprotein L [Homo sapiens]       337          337/337 (100%)  337/337 (100%)  e-167 
gi|13562090|ref|NP_112092.1| (NM_030882)           apolipoprotein L, 2 [Homo sapiens]               337          336/337 (99%)   337/337 (99%)   e-167 
gi|5725224|emb|CAB52401.1| (Z95114)                bK212A2.2 (apolipoprotein L, 2) [Homo sapiens]   279          278/279 (99%)   279/279 (99%)   e-131 
gi|12408013|gb|AAG53690.1|AF323540_1 (AF323540)    apolipoprotein L-I [Homo sapiens]                414          236/383 (61%)   285/383 (73%)   e-115 
gi|15824471|gb|AAL09358.1|AF305428_1 (AF305428)    apolipoprotein L1 precursor [Homo sapiens]       398          237/383 (61%)   285/383 (73%)   e-115 

The homology of these sequences is shown graphically in the ClustalW analysis 
shown in Table 14H. 



Table 14H. ClustalW Analysis of NOV14 

1) NOV14a (SEQ ID NO:38) 
2) NOV14b (SEQ ID NO:40) 
4) gi|13325156| (SEQ ID NO:239) 
5) gi|13562090| (SEQ ID NO:240) 
6) gi|5725224| (SEQ ID NO:241) 
7) gi|12408013| (SEQ ID NO:242) 
8) gi|15824471| (SEQ ID NO:243) 

[ClustalW alignment figure for the sequences listed above] 

The protein similarity information, expression pattern, cellular localization, and map 
location for the NOV14 protein and nucleic acid disclosed herein suggest that this 
Apolipoprotein L-like protein may have important structural and/or physiological functions 
characteristic of the Apolipoprotein family. Therefore, the nucleic acids and proteins of the 
invention are useful in potential diagnostic and therapeutic applications and as a research 
tool. These include serving as a specific or selective nucleic acid or protein diagnostic and/or 
prognostic marker, wherein the presence or amount of the nucleic acid or the protein is to be 
assessed. These also include potential therapeutic applications such as the following: (i) a 
protein therapeutic, (ii) a small molecule drug target, (iii) an antibody target (therapeutic, 
diagnostic, drug targeting/cytotoxic antibody), (iv) a nucleic acid useful in gene therapy (gene 
delivery/gene ablation), (v) an agent promoting tissue regeneration in vitro and in vivo, and 
(vi) a biological defense weapon. 

Epidemiological studies have demonstrated a strong inverse correlation between the 
levels of plasma high density lipoproteins (HDL) and risk of premature coronary heart 
disease (Miller, G. J., and Miller, N. E., 1975, Lancet i, 16-19; Gordon, et al., 1977, J. Am. 
Med. Assoc. 238, 497-499). However, the mechanisms by which HDL protect against 
atherosclerosis need further exploration. One proposed protective role of HDL involves 
reverse cholesterol transport, a process in which HDL acquire cholesterol from peripheral 
cells and facilitate its esterification and delivery to the liver. In this process, small, relatively 
lipid-poor HDL particles, termed pre-β1-HDL, have been postulated to be the first acceptors 
of cholesterol from the cells. An additional mechanism may involve the ability of HDL to 
impede the oxidation of other plasma lipoproteins (Glomset, J. A., 1968, J. Lipid Res. 9, 155- 
167; Kunitake, et al., 1987, National Institutes of Health Workshop on Lipoprotein 
Heterogeneity, NIH Publication 87, Vol. 2646, pp. 419-427, National Institutes of Health, 
Rockville, MD; Fielding, C. J., and Fielding, P. E., 1995, J. Lipid Res. 36, 211-228; Castro, 
G. R., and Fielding, C. J., 1988, Biochemistry 27, 25-29; Francone, et al., 1989, J. Biol. 
Chem. 264, 7066-7072; Parthasarathy, et al., 1990, Biochim. Biophys. Acta 1044, 275-283; 
Kunitake et al., 1992, Proc. Natl. Acad. Sci. U.S.A. 89, 6993-6997; Ohta, T., Takata, K., 
Horiuchi, S., Morino, Y., and Matsuda, I., 1989, FEBS Lett. 257, 435-438). 

Recently, Duchateau et al. (1997, J Biol Chem 272: 25576-82) identified and 
characterized a new protein present in human high density lipoprotein, apolipoprotein L. 
Expression of apolipoprotein L was only detected in the pancreas. The cDNA sequence 
encoding the full-length protein was cloned using reverse transcription-polymerase chain 
reaction. The deduced amino acid sequence contains 383 residues, including a typical signal 
peptide of 12 amino acids. No significant homology was found with known sequences. The 
plasma protein is a single chain polypeptide with an apparent molecular mass of 42 kDa. 
Antibodies raised against this protein detected a truncated form with a molecular mass of 39 
kDa. Both forms were predominantly associated with immunoaffinity-isolated apoA-I- 
containing lipoproteins and detected mainly in the density range 1.123 < d < 1.21 g/ml. Free 
apoL was not detected in plasma. ApoL-containing lipoproteins (Lp(L)) showed two major 
molecular species with apparent diameters of 12.2-17 and 10.4-12.2 nm in the plasma. 
Moreover, Lp(L) exhibited both pre-β and α electromobility. 

Mainly associated with apoA-I-containing lipoproteins, apoL is a marker of distinct 
HDL subpopulations. In an effort to gain insight into its as yet unknown function, 
Duchateau et al. (2000, J Lipid Res 41: 1231-6) studied the biological determinants of apoL 
levels in human plasma. The distribution of apoL in normal subjects is asymmetric, with 
marked skewing toward higher values. No difference was found in apoL concentrations 
between males and females, but they observed an elevation of apoL in primary 
hypercholesterolemia (10.1 vs. 8.5 microgram/mL in control), in endogenous 
hypertriglyceridemia (13.8 microgram/mL, P < 0.001), combined hyperlipidemia phenotype 
(18.7 microgram/mL, P < 0.0001), and in patients with type II diabetes (16.2 microgram/mL, 
P < 0.02) who were hyperlipidemic. Significant positive correlations were observed between 
apoL and the log of plasma triglycerides in normolipidemia (0.446, P < 0.0001), endogenous 
hypertriglyceridemia (0.435, P < 0.01), primary hypercholesterolemia (0.66, P < 0.02), 
combined hyperlipidemia (0.396, P < 0.04), hypo-alphalipoproteinemia (0.701, P < 0.005), 
and type II diabetes with hyperlipidemia (0.602, P < 0.01). Apolipoprotein L levels were also 
correlated with total cholesterol in normolipidemia (0.257, P < 0.004), endogenous 
hypertriglyceridemia (0.446, P = 0.001), and non-insulin-dependent diabetes mellitus 
(NIDDM) (0.548, P < 0.02). No significant correlation was found between apoL and body 
mass index, age, sex, HDL-cholesterol or fasting glucose and glycohemoglobin levels. ApoL 
levels in plasma of patients with primary cholesteryl ester transfer protein deficiency 
were significantly increased (7.1 +/- 0.5 vs. 5.47 +/- 0.27, P < 0.006). 

The NOV14 nucleic acids and proteins of the invention have applications in the 
diagnosis and/or treatment of various diseases and disorders. For example, the compositions 
of the present invention will have efficacy for the treatment of patients suffering from: 
premature coronary heart disease, hypercholesterolemia, endogenous hypertriglyceridemia, 
hyperlipidemia, type II diabetes, Alzheimer's, dysbetalipoproteinemia, hyperlipoproteinemia 
type III, atherosclerosis, xanthomatosis, premature coronary and/or peripheral vascular 
disease, hypothyroidism, systemic lupus erythematosus, diabetic acidosis, familial 
amyloidotic polyneuropathy, Down syndrome as well as other diseases, disorders and 
conditions. 

These materials are further useful in the generation of antibodies that bind 
immunospecifically to the novel substances of the invention for use in therapeutic or 
diagnostic methods. These antibodies may be generated according to methods known in the 
art, using prediction from hydrophobicity charts, as described in the "Anti-NOVX 
Antibodies" section below. The disclosed NOV14 protein has multiple hydrophilic regions, 
each of which can be used as an immunogen. In one embodiment, a contemplated NOV14 
epitope is from about amino acids 2 to 4. In another embodiment, a contemplated NOV14 
epitope is from about amino acids 30 to 40. In other specific embodiments, contemplated 
NOV14 epitopes are from about amino acids 60 to 80, 105 to 145, 250 to 260, 270 to 290, 
305 to 330 and 360 to 380. 

NOV15 

A disclosed NOV15 nucleic acid (designated CuraGen Acc. No. CG57146-01), which encodes a 
novel Rh type C Glycoprotein-like protein and includes the 1351 nucleotide sequence (SEQ 
ID NO:41), is shown in Table 15A. An open reading frame for the mature protein was 
identified beginning with a CAG initiation codon at nucleotides 1-3 and ending with a TGG 
stop codon at nucleotides 1336-1338. Putative untranslated regions are underlined in Table 
15A, and the start and stop codons are in bold letters. 
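
As a quick arithmetic check on the coordinates above, the length of the encoded polypeptide follows from the first nucleotide of the start codon and the last nucleotide of the stop codon. The sketch below is illustrative only: the function name `orf_protein_length` is hypothetical, and it assumes 1-based inclusive coordinates with the stop codon counted inside the span.

```python
def orf_protein_length(start_first_nt: int, stop_last_nt: int) -> int:
    """Number of residues encoded by an ORF given 1-based inclusive coordinates.

    Assumption: the span includes both the initiation codon and the stop codon,
    so one codon is subtracted for the stop.
    """
    span = stop_last_nt - start_first_nt + 1
    if span <= 0 or span % 3 != 0:
        raise ValueError("ORF span must be a positive multiple of 3")
    return span // 3 - 1


# Coordinates stated for the NOV15 open reading frame: nucleotides 1 to 1338.
print(orf_protein_length(1, 1338))  # 445
```

For the coordinates given, 1338 nucleotides correspond to 446 codons, that is, 445 residues plus the stop codon.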



Table 15A. NOV15 Nucleotide Sequence (SEQ ID NO:41) 

CAGCTGCCCTCCTTCAGGGGGCCAAGTCCCTGGAACTCACCTCCCAGTAGACCGCATCCTCAAAGCAG 
TTCTCATCTGAAGGTTGTCCCCAGAATGGTAATCTCAAAATGAGCCCCACAATGATGCCACCCATCAG 
GGCCATGGCCAGGGTCACCAAGAGACCATAAATCTGGAACTTTCCCTGTGTTCTTGCGGTCCAGTCCC 
CGTTGAAACCTTGAAAGTCAAAGGAATGGACAAGCCCTTCTTTTCCATAGACTTCAAGGCTGGCGGAG 
GCCGCTGTCACAGCACCCACGATGCCGCCTATGATGCCAGGAATGCCATGCAGATTGTTAATGCCACA 
TGTGTCCTGGATGTGCAGCCGGGACTCCAGGAATGGGGTCAGGTATACAAAACCCAGGGTGGAGATGA 
TGCCGCAGACGAAGCCGATGATGAGGGCACCGTAAGGCATGAGCATCATCTCAGCAGCGGTACCCACG 
GCCACCCCTCCTGCGAGCGTGGCATTCTGGATGTGCACCATGTCCAGCTTGCCCTTCTTGTGCAGGGC 
ACTGGATATTGCCACCGAGGTAAGCACGCAGGCTGCCAAGGAGCAGTAGGTGTTGATGGCGGCTCGGT 
GCTGGCTGTCCCCATGGTAGGATATGGCTGAGTTGAAGCTGGGCCAGTACATCCACAGGAAGAGGGTG 
CCAATCATGGCAAAGAGGTCCGACTGGTACACAGAATTCTGTCTCTCCTTGCTCTGCTCTAGGTTGCG 
TCGGTAGAGGATCCGGGTCACTGTGAGCCCAAAGTAGGCGCCAAATGTGTGGATGGTCATGGAGCCTC 
CTGCATCCTTCACCTTTAGCAGGTTAAGGAGAATGAACTCATTCACAGCGAAGAGGGTCACTTGGAAG 
AAAGTCATGATGAGCAGCTGAATGGGGCTGACTTTACCCAGAACTGCCCCAAAGGCCACGCAGACAGA 
GGCCACGCAGAAGTCAGCGTTGATGAGGTTCTCCACGCCCACGACGATGTAGCGGTCTTGTAAGAAGT 
GGAACCAGCCCTGCATGAGCAGCGCCCACTGGATGCCGAAGGCTGCCAACAGGAAGTTGAAGCCCAC^ 
GCGCTGAAGCCGTAGCGCTGCAGGAAAGTCATGAGGAAGCCGAAGCCCACGAAGACCATCACGTGCAC 
GTCCTGGAAGCTTGGGTAGCGATAGTAGAATTCGTTCTCCATGTCGCTCAAGTTCTTGTGCGTCCTCT 
CTGACCACCAGTGGGCGTCGGCCTCGAAGTCGTAGCGCACGAACACCCCGAAGAGAATCACCATAATC 
ACCTGCAGGAGCAGGCAGGTGAGCGGCAGCCGCCAGCGGAGGTTGGTGTTCCAGGCCAT 

The disclosed NOV15 nucleic acid sequence maps to chromosome 15q25 and has 
1319 of 1325 bases (99%) identical to a gb:GENBANK-ID:AF193809|acc:AF193809.1 
mRNA from Homo sapiens (Homo sapiens Rh type C glycoprotein (RHCG) mRNA, 
complete cds) (E = 7.8e'^^^). 
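
The identity figure quoted above is a simple ratio of matching positions to aligned positions. The following sketch shows the calculation; `percent_identity` is a hypothetical helper, not part of any alignment tool, and the whole-percent truncation is an assumption made to match the way the figure is reported here.

```python
def percent_identity(matches: int, aligned: int) -> float:
    """Percentage of aligned positions that are identical."""
    if aligned <= 0 or not 0 <= matches <= aligned:
        raise ValueError("need 0 <= matches <= aligned and aligned > 0")
    return 100.0 * matches / aligned


value = percent_identity(1319, 1325)
print(round(value, 2))  # 99.55
print(int(value))       # 99, truncated to a whole percent
```

The reported 99% is consistent with truncation rather than rounding, since 99.55 would otherwise round up to 100.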

The disclosed NOV15 polypeptide (SEQ ID NO:42) is 445 amino acid residues in 
length and is presented using the one-letter amino acid code in Table 15B. The SignalP, 
Psort and/or Hydropathy results predict that NOV15 has a signal peptide and is likely to be 
localized to the endoplasmic reticulum (membrane) with a certainty of 0.6850. In alternative 
embodiments, a NOV15 polypeptide is located to the plasma membrane with a certainty of 
0.6400, the Golgi body with a certainty of 0.4600, or the endoplasmic reticulum (lumen) with 
a certainty of 0.1000. SignalP predicts a likely cleavage site for a NOV15 peptide 
between amino acid positions 32 and 33, i.e., at the sequence VRY-DF. 
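
The predicted cleavage position can be illustrated by splitting the N-terminal residues of the disclosed sequence between positions 32 and 33. This is a minimal sketch, not a signal peptide predictor: `split_at_cleavage_site` is a hypothetical helper, and the string is only the first 60 residues as printed in Table 15B.

```python
def split_at_cleavage_site(protein: str, last_signal_residue: int) -> tuple[str, str]:
    """Split a protein after the given 1-based residue number.

    Returns (signal_peptide, mature_chain).
    """
    if not 0 < last_signal_residue < len(protein):
        raise ValueError("cleavage position must fall inside the sequence")
    return protein[:last_signal_residue], protein[last_signal_residue:]


# First 60 residues of SEQ ID NO:42 as printed in Table 15B.
n_terminus = "MAWNTNLRWRLPLTCLLLQVIMVILFGVFVRYDFEADAHWWSERTHKNLSDMENEFYYRY"

signal, mature = split_at_cleavage_site(n_terminus, 32)
print(signal[-3:] + "-" + mature[:2])  # VRY-DF
```

Residues 30 to 32 are VRY and residues 33 to 34 are DF, which matches the VRY-DF junction stated in the text.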

Table 15B. Encoded NOV15 Protein Sequence (SEQ ID NO:42) 

MAWNTNLRWRLPLTCLLLQVIMVILFGVFVRYDFEADAHWWSERTHKNLSDMENEFYYRY 
GFGFLMTFLQRYGFSAVGFNFLIJVAFGIQWALLMQGWFHFLQDRYIWGVENLINADFCVASVCVAFGAA^ 
VSPIQLLIMTFFQVTLFAVNEFILL^^^LKVKDAGGSMTIHTFGAYFGLTVTRILYRRNLEQSKERQNSVYQSD 
LFAMIGTLFI.WMYWPS FNS AIS YHGDSQHRAAINT YCSLAACVLTSVAI SS ALHKKGKLDMVHI QNATIAGGV 
AVGTAAEMMMP YGALI IGFVCGI I STLGFVYLTPFLESRI^HI QDTCGINNIJIGI PGI I GGI VGAVTAAS ASI^ 
EVYGKEGLVHSFDFQGFNGDWTARTQGKFQIYGLLVTLAMAUyiGGIIVGLILRLPFWGQPSDENCFEDAVYWE 
VSSRDIAP 

The NOV15 amino acid sequence was found to have 437 of 438 amino acid residues 
(99%) identical to, and 438 of 438 amino acid residues (100%) similar to, the 479 amino acid 
residue ptnr:SPTREMBL-ACC:Q9UBD6 protein from Homo sapiens (Human) (RH TYPE C 
GLYCOPROTEIN) (E = Z3q^^\ 

NOV15 is expressed in at least the following tissues: mammary gland, brain, kidney, 
and testis. Expression information was derived from the tissue sources of the sequences that were 
included in the derivation of the sequence of NOV15. 

Possible small nucleotide polymorphisms (SNPs) found for NOV15 are listed in 
Table 15C. 



Table 15C: SNPs 

Variant     Nucleotide Position   Base Change   Amino Acid Position   Amino Acid Change
13377000    215                   T>G           72                    Val>Gly
13377001    497                   A>G           166                   Glu>Gly
13377002    1205                  T>C           402                   Leu>Pro
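
The entries in Table 15C can be cross-checked against the standard genetic code: a nucleotide position fixes both the codon number and the position within that codon, and the base change must be able to convert a codon for the first amino acid into a codon for the second. The sketch below is a consistency check only. It assumes that nucleotide positions are counted from the first base of the open reading frame on the coding strand, and all function names are hypothetical.

```python
from itertools import product

# Standard genetic code in the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

THREE_TO_ONE = {"Val": "V", "Gly": "G", "Glu": "E", "Leu": "L", "Pro": "P"}


def codon_position(nt_pos: int) -> tuple[int, int]:
    """Return (codon number, position within codon), both 1-based."""
    return (nt_pos - 1) // 3 + 1, (nt_pos - 1) % 3 + 1


def change_is_consistent(nt_pos, ref, alt, aa_pos, aa_ref, aa_alt) -> bool:
    """True if some codon for aa_ref carries `ref` at the implied codon
    position and becomes a codon for aa_alt when `ref` is replaced by `alt`."""
    codon_no, offset = codon_position(nt_pos)
    if codon_no != aa_pos:
        return False
    for codon, aa in CODON_TABLE.items():
        if aa == THREE_TO_ONE[aa_ref] and codon[offset - 1] == ref:
            mutated = codon[: offset - 1] + alt + codon[offset:]
            if CODON_TABLE[mutated] == THREE_TO_ONE[aa_alt]:
                return True
    return False


snps = [
    (215, "T", "G", 72, "Val", "Gly"),
    (497, "A", "G", 166, "Glu", "Gly"),
    (1205, "T", "C", 402, "Leu", "Pro"),
]
for snp in snps:
    print(snp[0], codon_position(snp[0]), change_is_consistent(*snp))
```

Under the stated numbering assumption, each of the three variants falls at the second position of the listed codon, which is where a single base change can produce the listed amino acid change.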

NOV15 also has homology to the amino acid sequences shown in the BLASTP data 
listed in Table 15D. 



Table 15D. BLAST results for NOV15 

Gene Index/Identifier                       Protein/Organism                                  Length (aa)   Identity (%)    Positives (%)   Expect
gi|7706683|ref|NP_057405.1| (NM_016321)     Rh type C glycoprotein [Homo sapiens]             479           437/438 (99%)   438/438 (99%)   0.0
gi|9790197|ref|NP_062773.1| (NM_019799)     Rh type C glycoprotein [Mus musculus]             498           354/439 (80%)   397/439 (89%)   0.0
gi|14486157|gb|AAK14650.1| (AY013260)       Rh type C glycoprotein [Bos taurus]               459           342/439 (77%)   390/439 (87%)   0.0
gi|14486163|gb|AAK14653.1| (AY013263)       Rh type C glycoprotein [Oryctolagus cuniculus]    467           327/439 (74%)   389/439 (88%)   0.0
gi|10039355|dbj|BAB13346.1| (AB036511)      50 kD glycoprotein [Oryzias latipes]              488           272/441 (61%)   349/441 (78%)   e-159




The homology of these sequences is shown graphically in the ClustalW analysis 
shown in Table 15E. 



Table 15E. ClustalW Analysis of NOV15 

1) NOV15          (SEQ ID NO:42) 
2) gi|7706683|    (SEQ ID NO:244) 
3) gi|9790197|    (SEQ ID NO:245) 
4) gi|14486157|   (SEQ ID NO:246) 
5) gi|14486163|   (SEQ ID NO:247) 
6) gi|10039355|   (SEQ ID NO:248) 

[ClustalW alignment of the six sequences listed above; the aligned residues are not legible in this copy.] 

Table 15F lists the domain description from DOMAIN analysis results against 
NOV15. This indicates that the NOV15 sequence has properties similar to those of other 
proteins known to contain this domain. 

Table 15F. Domain Analysis of NOV15 

gnl|Pfam|pfam00909, Ammonium_transp, Ammonium Transporter Family. 
CD-Length = 395 residues, 94.4% aligned 
Score = 166 

NOV15:   48  NLSDMENEFYYRYPSFQDVH--VMVFVGFGFLMTFLQRYGFSAVGFNFLIAAFGIQWALL  105 
Sbjct:   23  GLVRSKNVIJTILyKNFQDVAIGVLA.VWGFGYSIjAFGDSY-FSGFXGNIXsLLAAGIQWGTL  81 

NOV15:  106  MQGWFHFLQDRY--IVVGVENLINADFCVASVCVAFGAVLGKVSPIQLLIMTFFQVTLFA  163 
Sbjct:   82  PD6LFFI.FQI1MFAATAITIISGAVAERIKFSAYLLFSALLGTLVYPPVAHWVWGEGGWLA  141 

NOV15:  164  VNEFILLNLLKVKDAGGSMTIHTFGAYFGLTVTRILYRRNLEQSKERQNSVYQSDLFAMI  223 
Sbjct:  142  KLGVLV DFAGSTWHIFGGYAGIAAALVLGPRIGRFTKN-EAITPHNI^FAVL  193 

NOV15:  224  GTLFLWMYWPSFNSAISYHGDSQHR-AAINTYCSLAACVLTSVAISSALHKKGKLDMVHI  282 
Sbjct:  194  GTLrJJWFGWFGFNAGSALTAIXaUUlAAAVNTNIJ^GGALTJ^ -LKT6KPNMLGI.  251 

NOV15:  283  QNATIAGGVAVGTAAEMMLMPYGALIIGFVCGIISTLGFVYLTPFLESRLHIQDTCGINN  342 
Sbjct:  252  ANGALAGLVAITPAC-GWSPWGALIIGLIAGVIjSVLGY KIiKEKLGIDDPLDVFP  305 

NOV15:  343  LHGIPGIIGGIVGAVTAASASLEVYGKEGLVHSFDFQGFNGDWTARTQGKFQIYGLLVTL  402 
Sbjct:  306  VHGVGGIWGGIAVGIFAALYVNTSGIYGGLL YGNSKQLGVQLIGIAVIL  354 

NOV15:  403  AMALMGGIIVGLILRLPFWGQ--PSDENCFEDAVY  435  (SEQ ID NO:249) 
Sbjct:  355  AYAFGVTFILGLLIiGLTLGLRVSEEEEKVGIiDLAEHGETAY  395  (SEQ ID NO:250) 




A number of evolutionarily-related proteins have been found to be involved in the 
transport of ammonium ions across membranes. See InterPro IPR001905. Members of this 
family include Yeast ammonium transporters MEP1, MEP2 and MEP3, Arabidopsis thaliana 
high affinity ammonium transporter (gene AMT1), Corynebacterium glutamicum 
ammonium and methylammonium transport system, Escherichia coli putative ammonium 
transporter amtB, Bacillus subtilis nrgA, Mycobacterium tuberculosis hypothetical protein 
MtCY338.09c, Synechocystis strain PCC 6803 hypothetical proteins sll0108, sll0537 and 
sll1017, Methanococcus jannaschii hypothetical proteins MJ0058 and MJ1343, and 
Caenorhabditis elegans hypothetical proteins C05E11.4, F49E11.3 and M195.3. 

As expected by their transport function, these proteins are highly hydrophobic and 
seem to contain from 10 to 12 transmembrane domains. 

The protein similarity information, expression pattern, cellular localization, and map 
location for the NOV15 protein and nucleic acid disclosed herein suggest that this Rh type C 
Glycoprotein-like protein may have important structural and/or physiological functions 
characteristic of the Rh type C Glycoprotein family. Therefore, the nucleic acids and proteins 
of the invention are useful in potential diagnostic and therapeutic applications and as a 
research tool. These include serving as a specific or selective nucleic acid or protein 
diagnostic and/or prognostic marker, wherein the presence or amount of the nucleic acid or 
the protein are to be assessed. These also include potential therapeutic applications such as 
the following: (i) a protein therapeutic, (ii) a small molecule drug target, (iii) an antibody 
target (therapeutic, diagnostic, drug targeting/cytotoxic antibody), (iv) a nucleic acid useful in 
gene therapy (gene delivery/gene ablation), (v) an agent promoting tissue regeneration in 
vitro and in vivo, and (vi) a biological defense weapon. 

The Rh blood group antigens are associated with human erythrocyte membrane 
proteins of approximately 30 kD, the so-called Rh30 polypeptides. Heterogeneously 
glycosylated membrane proteins of 50 and 45 kD, the Rh50 glycoproteins, are coprecipitated 
with the Rh30 polypeptides on immunoprecipitation with anti-Rh-specific mono- and 
polyclonal antibodies. The Rh antigens appear to exist as a multisubunit complex of CD47, 
LW, glycophorin B, and play a critical role in the Rh50 glycoprotein. 

Ridgwell et al. (1992) isolated cDNA clones representing a member of the Rh50 
glycoprotein family, the Rh50A glycoprotein. The cDNA clones containing the full coding 
sequence of the Rh50A glycoprotein predicted a 409-amino acid N-glycosylated membrane 
protein with up to 12 transmembrane domains. It showed clear similarity to the Rh30A 
protein in both amino acid sequence and predicted topology. The findings were considered 
consistent with the possibility that the Rh30 and Rh50 groups of proteins are different 
subunits of an oligomeric complex which is likely to have a transport or channel function in 
the erythrocyte membrane. By analysis of somatic cell hybrids, they mapped the Rh50A gene 
to 6p21-qter, indicating that genetic differences in the genes for the Rh30 polypeptide, rather 
than the Rh50 genes, specify the major polymorphic forms of the Rh antigens, because the 
Rh blood group maps to chromosome 1, not chromosome 6. Cherif-Zahar et al. (1996) 
carried out regional assignment of the Rh50 gene by isotopic in situ hybridization and 
concluded that it maps to 6p21.1-p11, probably 6p12. 

The Rh(null) types, Rh(null) regulator and Rh(mod) (in which trace amounts of Rh 
antigens are found), exhibit the same clinical abnormalities associated with chronic hemolytic 
anemia, stomatocytosis and spherocytosis, reduced osmotic fragility, and increased cation 
permeability. In addition, Rh(null) membranes characteristically have hyperactive membrane 
ATPases and reduced red cell cation and water content. Cherif-Zahar et al. (1996) proposed 
that mutant alleles of Rh50 are suppressors of the RH locus and account for most cases of 
Rh-deficiency. They analyzed the genes and transcripts encoding Rh, CD47, and Rh50 
proteins in 5 unrelated Rh(null) cases and identified 3 types of Rh50 mutations in the 
transcripts and genomic DNA from them. The first mutation was observed in homozygous 
state in 2 apparently unrelated individuals originating from South Africa and involved a 2-bp 
transversion and a 2-bp deletion, introducing a frameshift after the codon for tyrosine-51 
(180297.0001). They stated that, since the Rh50 glycoprotein was not detectable by flow 
cytometry or Western blot analysis on the red cells of these 2 individuals, it is likely that the 
predicted truncated Rh50 polypeptide (107 residues instead of 409) from these variants was 
degraded and not inserted into the membrane. The second mutation consisted of a single base 
deletion at nucleotide 1086, resulting in a frameshift after the codon for alanine-362 
(180297.0002). The deduced Rh50 protein was 376 amino acids long (instead of 409) and 
included 14 novel residues at its C terminus. Surprisingly, this mutation was found in the 
heterozygous state by RFLP analysis. Attempts to amplify the product of the second Rh50 
allele were unsuccessful, strongly suggesting that this transcript was either absent or poorly 
represented in reticulocytes. Cherif-Zahar et al. (1996) assumed that this allele was 
transcriptionally silent and that the subject's erythrocytes should carry half the normal dose of 
a truncated Rh50 protein. Interestingly, flow cytometry and Western blot analysis indicated a 
complete absence of the protein. They noted that RH and Rh50 proteins interact with each 
other and suggested that the C terminus of Rh50 may stabilize this interaction or may 
represent a site of protein-protein interaction critical for cell surface expression. 

The third Rh50 mutation identified by Cherif-Zahar et al. (1996) was a missense 
mutation caused by a G236A transition (180297.0003). Flow cytometry and Western blot 
analysis indicated that the mutant protein was expressed at the cell surface at only 20% of the 
wild type level. Cherif-Zahar et al. (1996) provided a diagram of the implication of the 3 
mutations in 4 patients with the Rh(null) phenotype of the regulator type. In the fifth subject 
with Rh(null) phenotype studied by Cherif-Zahar et al. (1996), all attempts to amplify the 
Rh50 transcript were unsuccessful, although Rh, CD47, and LW sequences were easily 
amplified and sequenced from reticulocyte RNAs. This suggested that the Rh50 gene was 
transcriptionally silent in this variant, as had been observed in 1 allele of the subject with the 
deletion of nucleotide 1086. Findings in these cases indicated to the authors that Rh antigens 
are significantly expressed only when Rh50 proteins are present. Cherif-Zahar et al. (1996) 
stated, however, that the converse is not true; a small amount of Rh50 may reach the cell 
surface in the absence of Rh proteins as indicated by the Rh(null) variant of the silent type. 
The identification of different Rh50 mutations may account for the well known heterogeneity 
of Rh(null) individuals classified as regulator and Rh(mod) types. 

Huang et al. (1998) described compound heterozygosity for 2 mutations in the Rh50 
glycoprotein gene. An 836G-A mutation in exon 6 resulted in a gly279-to-glu substitution, 
changing a central amino acid of the transmembrane segment 9. While cDNA analysis 
showed expression of the 836A allele only, genomic studies showed the presence of both 
836A and 836G alleles. A detailed analysis of gene organization led to the identification in 
the 836G allele of a defective donor splice site, caused by a G-to-A mutation in the invariant 
GT element of the splice donor site of intron 1. 

The Rh(mod) syndrome is a rare genetic disorder thought to result from mutations at a 
'modifier' separate from the suppressor underlying the regulator type of Rh(null) disease, i.e., 
the RHAG gene. Huang et al. (1999) studied this disorder in a Jewish family with a 
consanguineous background and analyzed RH and RHAG, the 2 loci that control Rh-antigen 
expression and Rh-complex assembly. Despite the presence of a d (D-negative) haplotype, no 
other gross alteration was found at the RH locus, and cDNA sequencing showed a normal 
structure of D, Ce, and ce Rh transcripts in family members. However, analysis of the RHAG 
transcript identified a single G-to-T transversion in the initiation codon, causing a missense 
amino acid change: ATG (met) to ATT (ile) (180297.0007). 

Huang (1998) determined the intron/exon structure of the Rh50 gene. The structure of 
the Rh50 gene is nearly identical to that of the Rh30 gene. Of the 10 exons assigned, 
conservation of size and sequence was confined mainly to the region from exons 2 to 9, 
suggesting that RH50 and RH30 were formed as 2 separate genetic loci from a common 
ancestor via a transchromosomal insertion event. 

The absence of the RhAG and Rh proteins in Rh(null) individuals leads to 
morphologic and functional abnormalities of erythrocytes, known as the Rh-deficiency 
syndrome. The RhAG and Rh polypeptides are erythroid-specific transmembrane proteins 
belonging to the same family (36% identity). Marini et al. (1997) and Matassi et al. (1998) 
found significant sequence similarity between the Rh family proteins, especially RhAG, and 
Mep/Amt ammonium transporters. Marini et al. (2000) showed that RhAG and also RhGK 
(605381), a human homolog expressed in kidney cells only, function as ammonium transport 
proteins when expressed in yeast. Both specifically complement the growth defect of a yeast 
mutant deficient in ammonium uptake. Moreover, ammonium efflux assays and growth tests 
in the presence of toxic concentrations of the analog methylammonium indicated that RhAG 
and RhGK also promote ammonium export. The results provided the first experimental 
evidence for a direct role of RhAG and RhGK in ammonium transport and were of high 
interest, because no specific ammonium transport system had been previously characterized 
in humans. 

Heitman and Agre (2000) diagrammed the phylogenetic tree of multiple sequences 
from human Rh blood group antigens, human Rh glycoproteins, nonhuman sequences with 
Rh homology, and ammonium transporters from yeast, bacteria, plants, and worms. In 2 
apparently unrelated subjects originating from South Africa and showing the Rh(null) 
phenotype of the regulator type (268150), Cherif-Zahar et al. (1996) found that nucleotides 
154-157 were changed from CCTC to GA (a 2-bp transversion and a 2-bp deletion), 
introducing a frameshift after the codon for tyrosine-51 and resulting in a premature stop 
codon at codon 107. 
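
The statement that replacing CCTC at nucleotides 154-157 with GA shifts the reading frame after codon 51 can be checked with simple arithmetic: a sequence change disrupts the frame whenever the net change in length is not a multiple of 3, and the first codon affected is the one containing the first altered base. The sketch below is illustrative; `describe_indel` is a hypothetical helper, and it assumes nucleotides are numbered from the first base of the initiation codon.

```python
def describe_indel(first_nt: int, ref: str, alt: str) -> dict:
    """Summarize a replacement of `ref` by `alt` starting at 1-based `first_nt`.

    Assumption: coding sequence numbering starts at the first base of the
    initiation codon, so codon k spans nucleotides 3k-2 to 3k.
    """
    net_change = len(alt) - len(ref)
    first_affected_codon = (first_nt - 1) // 3 + 1
    return {
        "first_affected_codon": first_affected_codon,
        "last_intact_codon": first_affected_codon - 1,
        "net_change": net_change,
        "frameshift": net_change % 3 != 0,
    }


# CCTC at nucleotides 154-157 replaced by GA, as described above.
print(describe_indel(154, "CCTC", "GA"))
```

With this numbering, codon 51 spans nucleotides 151-153, so a change beginning at nucleotide 154 leaves codon 51 intact, and the net loss of 2 bp shifts the frame from codon 52 onward.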

In a subject with Rh(null) of the regulator type (268150), Cherif-Zahar et al. (1996) 
found heterozygosity for a deletion of adenine-1086 which introduced a frameshift after the 
codon for alanine-362 and resulted in a premature stop codon at codon 376. In a subject with 
Rh(null) of the 'mod' type (268150), Cherif-Zahar et al. (1996) found a missense mutation, 
ser79 to asn, caused by a G-to-A transition at nucleotide 236. The other allele was apparently 
silent. 

Hyland et al. (1998) reported molecular findings in the case of an Rh(null) (268150) 
individual, Y.T., for whom the regulator or amorph type had never been formally 
documented, although the donor's cells were used in several biochemical studies. Preliminary 
family studies showed that functional D and C antigens were transmitted from Y.T. to 3 
children, suggesting that Y.T. belonged to the regulator type. Molecular studies showed that 
Y.T. inherited the mutation from her mother and was a compound heterozygote (composite 
heterozygote in the terminology of Hyland et al., 1998), carrying 1 mutant Rh50 allele and 1 
transcriptionally silent Rh50 allele. The Rh50 mRNA was found to contain an 836G-A 
transition yielding a missense and nonconservative gly279-to-glu (G279E) amino acid 
substitution within a predicted hydrophobic domain of the membrane protein. Y.T. was found 
by study of genomic DNA to be carrying both an 836A allele and an 836G allele but only the 
836A sequence was represented in cDNA, indicating that the 836G allele was silent. 

Huang et al. (1998) demonstrated compound heterozygosity of the Rh50 gene as the 
basis of the Rh(null) phenotype. One mutation was an 836G-A mutation resulting in a 
missense change, gly279 to glu, in exon 6. The other mutation was a change of the invariant 
GT element of the splice donor site of intron 1 to AT. The blood sample in this case was from 
a female proband (Y.T.) of Australian origin. Serologic tests confirmed the null status of Rh 
antigens (D-C-E-c-e- and Rh17-). See 180297.0004 and Huang et al. (1998). The same 
mutation was found by Cherif-Zahar et al. (1998) in homozygous state in a patient in 
California with Rh(null) of the regulator type (268150). Cherif-Zahar et al. (1998) described 
splicing mutations in the Rh50 gene in 2 unrelated patients with the 'typical Rh(null) 
syndrome' (268150). The first mutation affected the invariant G residue of the 3-prime 
acceptor splice site of intron 6, causing the skipping of the downstream exon and the 
premature termination of translation. The second mutation occurred at the first base of the 
5-prime donor splice site of intron 1 (180297.0005). Both of these mutations were found in 
homozygous state. 

In a Jewish family of Russian origin with a consanguineous background, Huang et al. 
(1999) found that the basis of the Rh(mod) syndrome was a met-to-ile mutation in the 
initiation codon of the RHAG transcript. This point mutation occurred in the genomic region 
spanning exon 1 of RHAG. The presence of the mutation in the mother and 2 children was 
confirmed by SSCP analysis. Although blood typing showed a very weak expression of Rh 
antigens, immunoblotting barely detected the Rh proteins in Rh(mod) membrane. In vitro 
transcription-coupled translation assays showed that the initiator mutants of Rh(mod), but not 
those of the wild type, could be translated from ATG codons downstream. The findings 
pointed to incomplete penetrance of the Rh(mod) mutation, in the form of 'leaky' translation, 
leading to some posttranslational defects affecting the structure, interaction, and processing 
of Rh50 glycoprotein. The mother in this pedigree (S.M.) and her brother (S.S.) were first 
described as cases of Rh(null). S.M. had a well-compensated hemolytic anemia, whereas S.S. 
had a normal hematologic count with numerous spherocytes and stomatocytes after 
splenectomy. S.M. was found to be homozygous for the mutation; S.S. was deceased at the 
time of study. The 2 children of S.M. were heterozygotes. 

In 1 patient with Rh-null disease of the regulator type (268150), Huang (1998) 
detected a shortened Rh50 transcript lacking the sequence of exon 7. They identified a 
G-to-A transition at the +1 site of IVS7 in homozygosity in this patient. This splicing mutation 
caused not only a total skipping of exon 7 but also a frameshift and premature chain 
termination. Thus, the deduced translation product contained 351 instead of 409 amino acids, 
with an entirely different C-terminal sequence following thr315. Huang et al. (1999) 
demonstrated that a Japanese patient with Rh-null hemolytic anemia of the regulator type 
(268150) was homozygous for 2 cis mutations in the RHAG gene: in exon 6, G-to-A 
transitions, GTT to ATT and GGA to AGA, which caused val270-to-ile and gly280-to-arg 
substitutions, respectively. In a Japanese patient with Rh-null hemolytic anemia of the 
regulator type (268150), Huang et al. (1999) identified a G-to-T transversion in exon 9 of the 
RHAG gene, converting GGT (gly) to GTT (val) at codon 380 in the transmembrane-12 
segment. The transversion, which was located at the +1 position of exon 9, had also affected 
pre-mRNA splicing and caused partial exon skipping. Despite a structurally normal Rh 
antigen locus, hemagglutination and immunoblotting showed no expression of Rh antigens or 
proteins. 

See: Cherif-Zahar, et al. Blood 92: 2535-2540, 1998. PubMed ID: 9746795; 
Cherif-Zahar, et al. Nature Genet. 12: 168-173, 1996. PubMed ID: 8563755; Heitman and Agre, 
Nature Genet. 26: 258-259, 2000. PubMed ID: 11062455; Huang, C.-H., J. Biol. Chem. 273: 
2207-2213, 1998. PubMed ID: 9442063; Huang, et al. Am. J. Hemat. 62: 25-32, 1999. 
PubMed ID: 10467273; Huang, et al. Am. J. Hum. Genet. 64: 108-117, 1999. PubMed ID: 
9915949; Huang, et al. Blood 92: 1776-1784, 1998. PubMed ID: 9716608; Hyland, et al. 
Blood 91: 1458-1463, 1998. PubMed ID: 9454778; Marini, et al. Nature Genet. 26: 341-344, 
2000. PubMed ID: 11062476; Marini, et al. Trends Biochem. Sci. 22: 460-461, 1997. 
PubMed ID: 9433124; Matassi, et al. Genomics 47: 286-293, 1998. PubMed ID: 9479501; 
and Ridgwell, et al. Biochem. J. 287: 223-228, 1992. PubMed ID: 1417776. 

The NOV15 nucleic acids and proteins of the invention have applications in the 
diagnosis and/or treatment of various diseases and disorders. For example, the compositions 
of the present invention will have efficacy for the treatment of patients suffering from: 
hemolytic anemia, stomatocytosis and spherocytosis, reduced osmotic fragility, and increased 
cation permeability; Rh(mod) syndrome, Rh(null) disease; Rh deficiency syndrome; 
ammonium transport; Von Hippel-Lindau (VHL) syndrome, Alzheimer's disease, stroke, 
tuberous sclerosis, hypercalcemia, Parkinson's disease, Huntington's disease, cerebral palsy, 
epilepsy, Lesch-Nyhan syndrome, multiple sclerosis, ataxia-telangiectasia, leukodystrophies, 
behavioral disorders, addiction, anxiety, pain, neurodegeneration; fertility, hypogonadism; 
diabetes, autoimmune disease, renal artery stenosis, interstitial nephritis, glomerulonephritis, 
polycystic kidney disease, systemic lupus erythematosus, renal tubular acidosis, IgA 
nephropathy, hypercalcemia, Lesch-Nyhan syndrome; Glutaricaciduria, type IIA; 
Hypercholesterolemia, familial, autosomal recessive; Tyrosinemia, type I as well as other 
diseases, disorders and conditions. 

These materials are further useful in the generation of antibodies that bind 
immunospecifically to the novel substances of the invention for use in therapeutic or 
diagnostic methods. These antibodies may be generated according to methods known in the 
art, using prediction from hydrophobicity charts, as described in the "Anti-NOVX 
Antibodies" section below. The disclosed NOV15 protein has multiple hydrophilic regions, 
each of which can be used as an immunogen. In one embodiment, a contemplated NOV15 
epitope is from about amino acids 40 to 55. In another embodiment, a contemplated NOV15 
epitope is from about amino acids 195 to 215. In other specific embodiments, contemplated 
NOV15 epitopes are from about amino acids 240 to 255, 290 to 295, 340 to 345 and 360 to 
365. 
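
The use of hydrophobicity charts to locate hydrophilic candidate regions can be sketched as a sliding-window average over a per-residue hydropathy scale. The example below is an illustration under stated assumptions, not the procedure used to derive the epitope ranges above: the text does not name a scale, so the Kyte-Doolittle scale is assumed here; the window size of 9 is arbitrary; the function names are hypothetical; and only the first 60 residues printed in Table 15B are used.

```python
# Kyte-Doolittle hydropathy values (assumed scale; higher means more hydrophobic).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq: str, window: int = 9) -> list[float]:
    """Mean hydropathy of every full-length window along the sequence."""
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KD[aa] for aa in seq]
    return [sum(scores[i:i + window]) / window
            for i in range(len(seq) - window + 1)]


def most_hydrophilic_window(seq: str, window: int = 9) -> tuple[int, int, float]:
    """Return (first residue, last residue, mean score) of the lowest-scoring
    window, using 1-based residue numbers."""
    profile = hydropathy_profile(seq, window)
    start = min(range(len(profile)), key=profile.__getitem__)
    return start + 1, start + window, profile[start]


# First 60 residues of SEQ ID NO:42 as printed in Table 15B.
n_terminus = "MAWNTNLRWRLPLTCLLLQVIMVILFGVFVRYDFEADAHWWSERTHKNLSDMENEFYYRY"
first, last, score = most_hydrophilic_window(n_terminus)
print(f"most hydrophilic 9-residue window: {first}-{last} (mean {score:.2f})")
```

Windows with strongly negative means are hydrophilic and therefore more likely to be surface exposed, while strongly positive stretches, such as the leucine- and valine-rich run near the N-terminus, are typical of signal peptides and transmembrane segments.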



NOV16 

A disclosed NOV16 nucleic acid (designated CuraGen Acc. No. CG57169-01), which encodes a 
novel Copine III-like protein and includes the 1763 nucleotide sequence (SEQ ID NO:43), is 
shown in Table 16A. An open reading frame for the mature protein was identified beginning 
with a CTG initiation codon at nucleotides 111-113 and ending with a TAT stop codon at 
nucleotides 1758-1760. Putative untranslated regions are underlined in Table 16A, and the 
start and stop codons are in bold letters. 



Table 16A. NOV16 Nucleotide Sequence (SEQ ID NO:43) 

AGCTCAGGTCGGGTTCTCGTAGCTGGTGGGGGGCAGGTTTTTATGCTTGAAATACTGCACAACTTGTTGGGGC^ 

CGCCAGCACAGCTTTGGCCAAGGTCTCTTTTGC TGCGTTGCGGAACTCTCGAAAGGGAACGAACTGCACAAT^ 

GGCTGCCTCCrCCCCCGTGTGGGAGCGCAGCS^TGCGGCIXSTCCCCATCCAGGAACTCCATGGC^ 

GCCCACGCCCACGATGATGATGGACATGGGCAGCTTGGAAGCCTGCACCACGGCATGCCGTGTCTCCT 

GATGACCCCGTCaSTGATGATGAGGAGGATGAAGTACTGCGTGGCCGTCCGOTGTTGTGTGGCCTGGGCCGCJ^ 

GGCCACGTGGTTGACGATGGGGGAGAAATTGGTAGGACCGTAGAAGCGGATGTGGGGCAGGCAAGCTGAGTACGCC^ 

GGCAATACCM^CCACACCraAGCAGAAGGGGTTGGTGGGGTTGAAGTTGATGGCAAACT 

tgggggtaactgggccccgaatcccagagctggaaacatcttatcactgtcgtactcctgaatga 

ccagatggcx:gac».gatattcgttggtgcccatagggttgatatagtgcaaagaggaagggtcgagg^^ 

ggaggctgtaaagtctattccaacggtgaact^tgagctggcagcctcccaggatgtagtcaaggaa^ 

GTTTATCTTGCAGGATCGCAGGATGATGATGCCCGAGTTTTTATAGTTCTTCTTCrTCCT 

GCACrCGAACTCCAGCGGGACGCrGTCTCGAGCCTCACACATCTGTGACACTGAGGTCTGGM 

ATGGCCCCCGTCATTGTCATAGTCGTAGCACATGACCTGGATGGGCTTCTCCa^TGTCCCCATCACACAGGGA 

GGGCACTGTGAATGGCPTCCAmCAGGGTCCAGTGTGTACTTGATCACCTCAGTCCTGTGGACCAGCA 

ATCGTCTCCTGGCTTATAAAACTCCAGAAAGGGGTCTGACTTCCCMAGAGGTCCTTCTTGTCCAGCCTCC^ 

CAGGCTTAGTGTGATGACGCGGTTGTCGGACAGCTCCTGGGCAGCGATCGTAATCS^GCCCTTCCCCGC^ 

ATTCAGCAGCA6CAGAGGCCTAGTGATCTTCTTGCTGGAGACGATCGTGCCCAGGCTGCAGGAGAACT 

GTCaVTGCTCGTCCAGCOKIATACTGGACTTGTCCrGGTCAAA^^ 

GTAGTCAAGCACGAACTTCTTGGAGAAGGCGGGGTTGAGGTTGTTGATCGCXSGTTTCTGTCCTGTCGTACTC^ 
TCTGCCATTGTTCTCTGTAAAGAGGACTVCAGAAGGGGTCGGACTTGGAGGTAACATCCCGGTCC^ 
ACTCACTGACAGCTCCACCTTGCACACGCAATACTGGGGGCCCATGGGGGCTGCCCCCGCTO 
GGGTATGTGGGCCATGGGAGCXZGGTGGCGGTGGCAGGAGTTCCTGGCAGTCGCAGGTCCCGCGGGCGCCACa^ 
ACCGCACGGCTGCCGCTGCCCGCGCTCCGAGCCACCCGGGGTATCCT 



The disclosed NOV16 nucleic acid sequence maps to chromosome 16 and has 924 of 
1344 bases (68%) identical to a gb:GENBANK-ID:HSAI33798|acc:AJ133798.1 mRNA 
from Homo sapiens (Homo sapiens mRNA for copine VI protein) (E = 1.5e"'^^). 

A disclosed NOV16 polypeptide (SEQ ID NO:44) is 549 amino acid residues in 
length and is presented using the one-letter amino acid code in Table 16B. The SignalP, Psort 
and/or Hydropathy results predict that NOV16 does not have a signal peptide and is likely to 
be localized to the endoplasmic reticulum (membrane) with a certainty of 0.6850. In 
alternative embodiments, a NOV16 polypeptide is located to the plasma membrane with a 
certainty of 0.6400, the Golgi body with a certainty of 0.4600, or the endoplasmic reticulum 
(lumen) with a certainty of 0.1000. 

Table 16B. Encoded NOV16 Protein Sequence (SEQ ID NO:44) 

FVIiDYHFEEVQKLKFAIJ?X>QDKBSMRLDEHDFLGQFSCSLGTIVSSKKITRPLLLLl^ 
ITLSLAGRRLDKKDLFGKSDPFLEFYKPGDIXSKWMLVHRTEVIKyTIJDPWKPFTVPLVSLC^ 

NDGGHDFIGEFQTSVSQMCEARDS VPLEFECINPKKQRKKKNYKNSGI I ILRSCKINRDYSFLDYILGGCQIjMFTVGI 
DFTASNGNPIJ^PSSLHYINPMGTNEYLSAIWAVGQIIQDYDSDKMFPALGFGAQLPPDWKVSHEFAINFNPTOT 
VDGIAQAYSACLPHIRFYGPTNFSPIVIOHVARFAAQATQQRTATQYFILLIITDGVISDMEETRHAVVQASKLPMSII 
IVGVGNADFAAMEFLIXSDSRMLRSHTGEEAARDIVQFVPFREFRNAAKETLAKAVL^ 

NPT 

The NOV16 amino acid sequence was found to have 341 of 527 amino acid residues 
(64%) identical to, and 427 of 527 amino acid residues (81%) similar to, the 537 amino acid 
residue ptnr:SWISSNEW-ACC:O75131 protein from Homo sapiens (Human) (COPINE III) 
(E = 5.1e"'^^). 
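As a non-limiting sketch, the percentage figures quoted in the prose above can be reproduced from the match counts. The helper name `blast_percent` is hypothetical. The assumption that percentages are truncated rather than rounded is inferred only from the figures quoted in the text (for example, 341 of 527 is reported as 64%) and may not hold for every table entry.

```python
def blast_percent(matches: int, aligned_length: int) -> int:
    """Whole-number percentage of matching positions in an alignment.

    Assumption: the value is truncated (floor), which is consistent with
    the identity and similarity figures quoted in the surrounding prose.
    """
    if aligned_length <= 0 or not 0 <= matches <= aligned_length:
        raise ValueError("matches must lie between 0 and aligned_length")
    return matches * 100 // aligned_length


# Figures quoted for NOV16 against copine III
print(blast_percent(341, 527))  # identity: 64
print(blast_percent(427, 527))  # similarity: 81
```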

NOV16 is expressed in at least the following tissues: Bone, Brain, Ovary, Spinal 
Cord, and Uterus. Expression information was derived from the tissue sources of the 
sequences that were included in the derivation of the sequence of NOV16. 

NOV16 also has homology to the amino acid sequences shown in the BLASTP data 
listed in Table 16C. 



Table 16C. BLAST results for NOV16 

Gene Index/Identifier                             Protein/Organism                                 Length (aa)  Identity (%)   Positives (%)  Expect
gi|14714939|gb|AAH10627.1|AAH10627 (BC010627)     Unknown (protein for MGC:16924) [Homo sapiens]   446          442/444 (99%)  443/444 (99%)  0.0
gi|15318878|ref|XP_053605.1| (XM_053605)          hypothetical protein XP_053605 [Homo sapiens]    358          354/356 (99%)  355/356 (99%)  0.0
gi|4503015|ref|NP_003900.1| (NM_003909)           copine III [Homo sapiens]                        537          339/523 (64%)  424/523 (80%)  0.0
gi|4503013|ref|NP_003906.1| (NM_003915)           copine I [Homo sapiens]                          537          311/531 (58%)  400/531 (74%)  0.0
gi|14193684|gb|AAK56087.1|AF332058_1 (AF332058)   copine 1 protein [Mus musculus]                  454          267/453 (58%)  351/453 (76%)  e-162



The homology of these sequences is shown graphically in the ClustalW analysis 
shown in Table 16D. 

Table 16D. ClustalW Analysis of NOV16 

1) NOV16 (SEQ ID NO:44) 
2) gi|14714939 (SEQ ID NO:251) 
3) gi|15318878 (SEQ ID NO:252) 
4) gi|4503015 (SEQ ID NO:253) 
5) gi|4503013 (SEQ ID NO:254) 
6) gi|14193684 (SEQ ID NO:255) 

[Alignment blocks not recoverable from the scanned source. The final block gives the 
aligned sequence lengths: NOV16, 549; gi|14714939, 446; gi|15318878, 358; 
gi|4503015, 537; gi|4503013, 537; gi|14193684, 454.] 


Table 16E lists the domain description from DOMAIN analysis results against 
NOV16. This indicates that the NOV16 sequence has properties similar to those of other 
proteins known to contain these domains. 



Table 16E Domain Analysis of NOV16 

gnl|Smart|smart00239, C2, Protein kinase C conserved region 2 (CalB); Ca2+- 
binding motif present in phospholipases, protein kinases C, and 
synaptotagmins (among others). Some do not appear to contain Ca2+-binding 
sites. Particular C2s appear to bind phospholipids, inositol polyphosphates, 
and intracellular proteins. Unusual occurrence in perforin. Synaptotagmin 
and PLC C2s are permuted in sequence with respect to N- and C-terminal beta 
strands. SMART detects C2 domains using one or both of two profiles. 

CD-Length = 101 residues, 87.1% aligned 

Score = 64.7 bits (156), Expect = 1e-11 

NOV16: 161 LAGRRLDKKDLFGKSDPFLEFYKPGDDGKWMLVHRTEVIKYTLDPVW-KPFTVPLVSLCD 219 

M !I II + ll+ill + I + 

Sbjct: 7 ISARNLPPKDKGGKSDPYVKVSLDGDPRE KKKTKVVKNTLNPVWNETFEFEVPPPEL 63 

NOV16: 220 GDMEKPIQVMCYDYDNDGGHDFIGEFQTSVSQMCE 254 (SEQ ID NO:256) 

+++ II I Mil +1 + 

Sbjct: 64 SELEIEVYDKDRPSRDDFIGRVTIPLSDLLL 94 (SEQ ID NO:257) 

CD-Length = 101 residues, 93.1% aligned 
Score = 62.4 bits (150), Expect = 7e-ll 

NOV16: 30 VSGQNia^iU>VTSKSDPFCVLFTEmGRWIEYDRTETAIiraLNPAFSKKFVIJ^ 89 

+1 +11 +1 IIII+ + + + I I +1+ I ill +++ I + 1+ 
Sbjct: 7 ISARNLPPKDKGGKSDPYVKVSLDGDPR--:^KKTKVVKNTLNPVWNETFEFEVPPPELS 64 

NOV16: 90 KLKFALFDQDKSSMRLDEHDFLGQFSCSLGTIVSSKKITR 129 (SEQ ID NO: 258) 

+1+ ++I+I+ I I [+1+ + 1 ++ + + 

Sbjct: 65 ELEIEVYDKDRPS RDDFIGRVTIPLSDLLLG6RHEK 100 (SEQ ID NO: 259) 
gnl|Pfam|pfam00168, C2, C2 domain. 

CD-Length = 88 residues, 93.2% aligned 
Score = 56.6 bits (135), Expect = 4e-09 

NOV16: 30 VSGQNI>LDIU3VTSKSDPFCVI.FTEEJNCaaWIEYDRTETAINNr^ 88 

+1 +11 1+ 111++++++ +1+1 III +++ II + 

Sbjcti 6 ISARNLPKMDMNGLSDPYVKVDLDQDPKDTKKFKTKTVKKTLNPVW^ 65 

NOV16: 89 QKLKFALFDQDKSSMRLDEHDFLGQF 114 (SEQ ID NO: 260) 

I 11 + 11 

Sbjct: 66 ASLRFAVYDEDRFS RDDFIGQV 87 (SEQ ID NO: 261) 

CD-Length = 88 residues, 93.2% aligned 
Score = 56.6 bits (135), Expect = 4e-09 

NOV16: 161 LAGRRLDKKDLFGKSDPFLEFYKPGDDGKWMLVHRTEVIKYTLDPVW-KPFTVPLVSLCD 219 

++ I II k I I I +1+ H Ihlll + 1 11 I 

SbJct: 6 ISARNLPKMDMNGLSDPYVKV-DriDGDPKDTKKFKTKTVKKTIiNPVWNETF\^ 64 

NOV16: 220 GDMEKPIQVMCYDYDNDGGHDFIGEF 245 (SEQ ID NO:262) 
11 I lllh 

Sbjct: 65 ASLRFAVYDEDRFSRDDFIGQV 87 (SEQ ID NO:263) 



gnl|Smart|smart00327, VWA, von Willebrand factor (vWF) type A domain; VWA 
domains in extracellular eukaryotic proteins mediate adhesion via metal ion- 
dependent adhesion sites (MIDAS). Intracellular VWA domains and homologues 
in prokaryotes have recently been identified. The proposed VWA domains in 
integrin beta subunits have recently been substantiated using sequence-based 
methods (Ponting et al., Adv Prot Chem (2000) in press). 

CD-Length = 180 residues, 92.2% aligned 

Score = 40.8 bits (94), Expect = 2e-04 
(SEQ ID NO:264) 

NOV16: 333 MGTNEYLSAIWAVGQIIQDYDSDKMFPALGFGAQLPPDWKVSHEFAINFNPTNPFCSGVD 392 

II I + I t ++++ I +1 I + + I + I 

Sbjct: 14 MGGNRFELAKEFVLKLVEQLDIGPDGDRVGL VTFSSDARVLPPLND — SQSKD 64 

NOV16: 393 GIAQAYSACLPHIRFYGPTNFSPIVNHVARPAAQATQQRTATQYFILLIITDGVISD-ME 451 

+ +1 ++ III + + + +I++IIII +1 I 

Sbjct: 65 ALLEALASLSYS--LGG6TNL6AALSYALENLFSESAGSRRGAPKVLILITD6ESNDG6B 122 

NOV16: 452 ETRHAWQASKLPMSIIIVGVGNA-DFAAMEFLDGDSRMLRS-HTGEEAARDIVQFV 506 

+ (+++++[|||[| ++[ + +++ 
Sbjct: 123 DILKAAKELKRSGVKVFWGVGNDVDEEELKKLASAPGGVFWEDLPSLLDLLIDLL 179 
(SEQ ID NO:265) 
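For illustration, the "% aligned" figures reported with each conserved-domain (CD) hit in Table 16E appear to follow from the subject coordinates of the alignment and the CD length. The sketch below is offered under that assumption; the function name `percent_aligned` is hypothetical and the calculation is inferred from the reported values, not taken from the search software.

```python
def percent_aligned(sbjct_start: int, sbjct_end: int, cd_length: int) -> float:
    """Percentage of a conserved-domain profile covered by an alignment.

    sbjct_start, sbjct_end -- first and last profile positions in the
                              alignment (1-based, inclusive)
    cd_length              -- full length of the domain profile
    """
    if not 1 <= sbjct_start <= sbjct_end <= cd_length:
        raise ValueError("coordinates must satisfy 1 <= start <= end <= cd_length")
    covered = sbjct_end - sbjct_start + 1
    return round(100.0 * covered / cd_length, 1)


# Coordinates taken from the hits listed in Table 16E
print(percent_aligned(7, 94, 101))    # SMART C2, first hit
print(percent_aligned(6, 87, 88))     # Pfam C2
print(percent_aligned(14, 179, 180))  # SMART VWA
```

With the coordinates shown in the table, these calls give 87.1, 93.2 and 92.2, matching the reported coverage values.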



Some isozymes of protein kinase C (PKC) contain a domain, known as C2, of about 
116 amino-acid residues which is located between the two copies of the C1 domain (that bind 
phorbol esters and diacylglycerol) (see PROSITEDOC PDOC00379) and the protein kinase 
catalytic domain (see PROSITEDOC PDOC00100). Regions with significant homology to 
the C2 domain have been found in many proteins. The C2 domain is thought to be involved 
in calcium-dependent phospholipid binding. Since domains related to the C2 domain are also 
found in proteins that do not bind calcium, other putative functions for the C2 domain, such as 
binding to inositol-1,3,4,5-tetraphosphate, have been suggested. 

The 3D structure of the C2 domain of synaptotagmin has been reported; the domain 
forms an eight-stranded beta sandwich constructed around a conserved 4-stranded motif, 
designated a C2 key. Calcium binds in a cup-shaped depression formed by the N- and C- 
terminal loops of the C2-key motif. The domain information provided in Table 16E 
indicates that the sequence of the invention has properties similar to those of other proteins 
known to contain these domains and similar to the properties of these domains. 

Molecular events at the interface of the cell membrane and cytoplasm may be 
regulated by proteins that attach to and detach from the membrane surface in response to 
signals. Calcium-dependent membrane-binding proteins may play such a role. To identify 
proteins that may underlie membrane trafficking processes in ciliates, Creutz et al. (1998) 
isolated calcium-dependent phospholipid-binding proteins from Paramecium. They named 
the major protein that they obtained 'copine' (pronounced 'ko-peen'), the French feminine 
noun meaning 'friend,' because it associates like a 'companion' with lipid membranes. The 
55-kD copine protein bound phosphatidylserine in a calcium- but not magnesium-dependent 
manner, but it did not bind phosphatidylcholine. Copine promoted calcium-dependent 
aggregation of lipid vesicles. The authors cloned partial cDNAs representing 2 distinct 
Paramecium copine genes. By searching sequence databases for genes with sequence 
similarity to the Paramecium copine genes, Creutz et al. (1998) identified human ESTs 
corresponding to 5 copine genes, named copine I to V. Two overlapping ESTs contained the 
complete copine I (CPNE1) coding sequence. The deduced 537-amino acid CPNE1 protein 
contains 2 type II C2 domains in its N-terminal half and a domain similar to the A domain, 
which is present in a number of extracellular proteins or the extracellular portions of 
membrane proteins, in its C-terminal half; it does not have a predicted signal sequence or 
transmembrane domains. C2 domains mediate calcium-dependent interactions with 
phospholipids, and the A domain of integrins appears to mediate the binding of the integrin to 
extracellular ligands. CPNE1 has a broad tissue distribution. Recombinant CPNE1 expressed 
in bacteria exhibited calcium-dependent binding to phosphatidylserine vesicles. Antibody 
against CPNE1 reacted with bovine chromobindin-17, which is a 55-kD calcium-dependent 
chromaffin vesicle-binding protein, and the authors concluded that chromobindin-17 is a 
copine. They suggested that copines function in membrane trafficking. See Creutz, et al., J. 
Biol. Chem. 273: 1393-1402, 1998, PubMed ID: 9430674; Ishikawa, et al., DNA Res. 5: 
169-176, 1998, PubMed ID: 9734811. 

The protein similarity information, expression pattern, cellular localization, and map 
location for the NOV16 protein and nucleic acid disclosed herein suggest that this Copine 
III-like protein may have important structural and/or physiological functions characteristic of 
the Copine III family. Therefore, the nucleic acids and proteins of the invention are useful in 
potential diagnostic and therapeutic applications and as a research tool. These include serving 
as a specific or selective nucleic acid or protein diagnostic and/or prognostic marker, wherein 
the presence or amount of the nucleic acid or the protein is to be assessed. These also 
include potential therapeutic applications such as the following: (i) a protein therapeutic, (ii) a 
small molecule drug target, (iii) an antibody target (therapeutic, diagnostic, drug 
targeting/cytotoxic antibody), (iv) a nucleic acid useful in gene therapy (gene delivery/gene 
ablation), (v) an agent promoting tissue regeneration in vitro and in vivo, and (vi) a biological 
defense weapon. 

The NOV16 nucleic acids and proteins of the invention have applications in the 
diagnosis and/or treatment of various diseases and disorders. For example, the compositions 
of the present invention will have efficacy for the treatment of patients suffering from: Von 
Hippel-Lindau (VHL) syndrome, Alzheimer's disease, stroke, tuberous sclerosis, 
hypercalcemia, Parkinson's disease, Huntington's disease, cerebral palsy, epilepsy, 
Lesch-Nyhan syndrome, multiple sclerosis, ataxia-telangiectasia, leukodystrophies, behavioral 
disorders, addiction, anxiety, pain, neurodegeneration, cancer, trauma, tissue regeneration (in 
vitro and in vivo), viral/bacterial/parasitic infections, immunological disease, respiratory 
disease, gastro-intestinal diseases, reproductive health, neurological and neurodegenerative 
diseases, bone marrow transplantation, metabolic and endocrine diseases, allergy and 
inflammation, nephrological disorders, cardiovascular diseases, muscle, bone, joint and 
skeletal disorders, hematopoietic disorders, urinary system disorders, systemic lupus 
erythematosus, autoimmune disease, asthma, emphysema, scleroderma, allergy, ARDS, 
fertility, as well as other diseases, disorders and conditions. 

These materials are further useful in the generation of antibodies that bind 
immunospecifically to the novel substances of the invention for use in therapeutic or 
diagnostic methods. These antibodies may be generated according to methods known in the 
art, using prediction from hydrophobicity charts, as described in the "Anti-NOVX 
Antibodies" section below. The disclosed NOV16 protein has multiple hydrophilic regions, 
each of which can be used as an immunogen. In one embodiment, a contemplated NOV16 
epitope is from about amino acids 30 to 90. In another embodiment, a contemplated NOV16 
epitope is from about amino acids 95 to 98. In other specific embodiments, contemplated 
NOV16 epitopes are from about amino acids 99 to 105, 120 to 122, 130 to 132, 140 to 190, 
210 to 220, 260 to 290, 320 to 330, 340 to 375, 400 to 410, 420 to 440 and 490 to 550. 
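
One possible way of locating hydrophilic regions of the kind referred to above is a sliding-window hydropathy scan. The sketch below is illustrative only: the text does not state which scale or window was used, so the Kyte-Doolittle scale, the window of 7 residues, the threshold of 0.0 and the name `hydrophilic_windows` are all assumptions introduced here.

```python
# Kyte-Doolittle hydropathy values (negative = hydrophilic).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydrophilic_windows(seq, window=7, threshold=0.0):
    """Return (start, end, mean) for windows whose mean hydropathy is
    below `threshold`. Positions are 1-based and inclusive. The sequence
    must contain only the 20 standard one-letter codes."""
    hits = []
    for i in range(len(seq) - window + 1):
        mean = sum(KD[aa] for aa in seq[i:i + window]) / window
        if mean < threshold:
            hits.append((i + 1, i + window, round(mean, 2)))
    return hits


# Fragment of NOV16 (residues 90 to 129 as shown in Table 16E)
fragment = "KLKFALFDQDKSSMRLDEHDFLGQFSCSLGTIVSSKKITR"
for start, end, mean in hydrophilic_windows(fragment):
    print(start + 89, end + 89, mean)  # shift to NOV16 numbering
```

Overlapping windows that pass the threshold would ordinarily be merged into contiguous candidate regions before any epitope is selected.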
NOV17 

A disclosed NOV17 (designated CuraGen Acc. No. CG57177-01), which encodes a 
novel Carboxypeptidase B, Pancreatic-like protein and includes the 1070 nucleotide sequence 
(SEQ ID NO:45) is shown in Table 17A. An open reading frame for the mature protein was 
identified beginning with an ATG initiation codon at nucleotides 1-3 and ending with a TAG 
stop codon at nucleotides 1048-1050. Putative untranslated regions are underlined in Table 
17A, and the start and stop codons are in bold letters. 



Table 17A. NOV17 Nucleotide Sequence (SEQ ID NO:45) 

ATGTTGGCACTCTTGGTTCTGGTGACTGTGGCCCTGGCATCTGCTCATCATGGTGGTGAGCACTTTGAA 

GGGGAGAAGGTGTTCCGTGTTAACGTTGAAGATGAAAATCACATTAACATAATCCGCGAGTTGGCCACC 

TTTATTCAGATTGACITCTGGAAGCCAGATTCTGTCACACAAATCMACCT^^ 

CGTGTTAAAGCTIGAAGATACTGTCACTGTGGAGAATGTTCTAAAGCAGAATGAACTACAATAC^ 

CTGATAAGCAACCTGAGAAATGTGGTGGAGGCTCAGTTTGATAGCCGGGTTCGTGCAACAGGACACAGT 

TATGAGAAGTACAACAAGTGGGAAACGATAGAGGCTTGGACTCAACAAGTCGCCACTGAGAATCCAGCC 

CTCATCTCTCGCAGTGTTATCGGAACCACATTTGAGGGACGCGCTATTTACCTCCTGAAGGTTGGCAAA 

GCTGGACAAAATAAGCCTGCCATTTTCATGGAATGTGGTTTCCATGCCAGAGAGTGGATTTCTCCTGCA 

TTCTGCCAGTGGTTTGTAAGAGAGGCTGTTCGTACCTATGGACGTGAGATCCAAGTGACAGAGCTTCTC 

GACAAGTTAGACTTTTATGTCCTGCCTGTGCTCAATATTGATGGCTACATCTACACCTGGACCAAGAGC 

CGATTTTGGAGAAAGACTTCGCTCCACCCATACTGGATCTACCCTTACTCATATGCTTACAAACTCGGT 

GAGAACAATGCTGAGTTGAATGCCCTGGCTAAAGCTACTGTGAAAGAACTTGCCTCACTGCACGGCACC 

AAGTACT^CATATGGCCCGGGAGCTACAACAATCTATCCTGCTGCTGGGGGCTCTGACGACTGGGCTTAT 

GACCAAGGAATCAGATATTCCTTCACCTTTGAACTTCGAGATACAGGCAGATATGGCTTTCTCCTTCCA 

GAATCCCAGATCCGGGCTACCTGCGAGGAGACCTTCCTGGCAATCAAGTATGTTGCCAGCTACGTCCTG 

GAACACCTGTACTAGTTGAGAAAGCTGATGGCCTT 

The disclosed NOV17 nucleic acid sequence maps to chromosome 3 and has 626 of 
729 bases (85%) identical to a gb:GENBANK-ID:DOGZAP47|acc:D78348.1 mRNA from 
Canis familiaris (Dog mRNA for zymogen granule membrane associated protein (ZAP47), 
complete cds) (E = 4.0e"^^'). 

A disclosed NOV17 polypeptide (SEQ ID NO:46) is 349 amino acid residues in 
length and is presented using the one-letter amino acid code in Table 17B. The SignalP, 
Psort and/or Hydropathy results predict that NOV17 does not have a signal peptide and is 
likely to be localized to the outside of the cell with a certainty of 0.5422. In alternative 
embodiments, a NOV17 polypeptide is located to the microbody (peroxisome) with a 
certainty of 0.2456, the endoplasmic reticulum (membrane) with a certainty of 0.1000, or the 
endoplasmic reticulum (lumen) with a certainty of 0.1000. 



Table 17B. Encoded NOV17 Protein Sequence (SEQ ID NO:46) 

MLALLVLVTVAIASAHHGGEHFEGEKVFRVNVEDElSrHINIIRELATF 

TVENVLKQlSnELQYKVL I SNLRNVVEAQFDSRVRATGHSYEKYNKWETIEAWTQQVATENPALI SR^ 

LLKVGKAGQNKPAIFMECGFHAREWISPAFCQWFVREAVRTYGREIQVTELLDKIiDFYVLPVm 

RKTSLHPYWIYPYSYAYKLGENNAEIjNALAKATVKELASLHGTKYTyGP^ 

RDTGRYGFLLPESQIRATCEETFLAIKYVASYVLEHLY 

The NOV17 amino acid sequence was found to have 234 of 240 amino acid residues 
(97%) identical to, and 236 of 240 amino acid residues (98%) similar to, the 416 amino acid 
residue ptnr:pir-id:A42332 protein from human (carboxypeptidase B (EC 3.4.17.2) precursor, 
pancreatic) (E = 5.4e ^^^). 

NOV17 is expressed in at least the following tissues: pancreas, blood, stomach. 
Expression information was derived from the tissue sources of the sequences that were 
included in the derivation of the sequence of NOV17. 

Possible single nucleotide polymorphisms (SNPs) found for NOV17 are listed in 
Table 17C. 



Table 17C: SNPs 

Variant    Nucleotide Position  Base Change  Amino Acid Position  Amino Acid Change
13374719   516                  A>C          172                  Glu>Asp
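
As an illustrative check of the entry in Table 17C, a nucleotide position can be mapped to its codon and the base change applied. The helper names below are hypothetical. The sketch assumes the open reading frame starts at nucleotide 1, as stated for NOV17; the reference codon GAA is inferred from the table, since of the two glutamate codons (GAA and GAG) only GAA carries an A at the third position.

```python
# Only the codons needed for this example are listed.
CODONS = {"GAA": "Glu", "GAG": "Glu", "GAT": "Asp", "GAC": "Asp"}


def codon_position(nt_pos, orf_start=1):
    """Return (codon number, position within codon), both 1-based."""
    offset = nt_pos - orf_start
    if offset < 0:
        raise ValueError("position lies upstream of the ORF")
    return offset // 3 + 1, offset % 3 + 1


def apply_snp(codon, pos_in_codon, ref, alt):
    """Replace `ref` with `alt` at the given 1-based codon position."""
    if codon[pos_in_codon - 1] != ref:
        raise ValueError("reference base does not match codon")
    return codon[:pos_in_codon - 1] + alt + codon[pos_in_codon:]


codon_number, pos = codon_position(516)    # variant 13374719
variant = apply_snp("GAA", pos, "A", "C")  # A>C
print(codon_number, pos, CODONS["GAA"], "->", CODONS[variant])
```

Under these assumptions nucleotide 516 is the third base of codon 172, and the A>C change converts GAA (Glu) to GAC (Asp), in agreement with the table.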



Other NOV17 variants include the nucleic acids depicted in Table 17D and the 
proteins depicted in Table 17E. 



Table 17D. Alignment of DNA sequences for NOV17 and variants 



169648881 
169648885 
169648904 
169648937 
NOV17 



169648881 
169648885 
169648904 
169648937 
NOV17 



169648881 
169648885 
169648904 
169648937 
NOV17 



10 



20 



30 



ATGTTGGCACTCTTGGTTCTGGTGACTGTGGCO 
60 70 80 




90 



100 



I 



TGGTGGTGAGCACTTTGAAGGCGAGAAGGTGTTCCGTGTTAACGTTGAAG 
IGGTGGTGAGCACTTTGAAGGCGAGAAGGTGTTCCGTGTTAACGTTGAAG 
rGGTGGTGAGCACTTTGAAGGCGAGAAGGTGTTCCGTGTTAACGTTGAAG 
rGGTGGTGAGCACTTTGAAGGCGAGAAGGTGTTCCGTGTTAACGTTGAAG 

iggtggtgagcactttgaaggSgagaaggtgttccgtgttaacgttgaag 



6X 
61 
61 
61 
100 



110 



120 



130 



140 



150 




160 170 180 190 200 

169648881 
169648885 
169648904 
169648937 



gacttctggaagccagattctgtcacacaaatcaaacctcacagtacagt 
gacttctggaagccagattctgtcacacaaatcaaacctcacagtacagt 
gacttctggaagccagattctgtcacacaaatcaj\acctcacagtacagt 
gacttctggaagccagattctgtcacacaaatcaaj^cctcacagtacagt 
NOV17 



ACCTCACAGTACAi 



200 



169648881 
169648885 
169648904 
169648937 
NOV17 



I 



210 



I 



220 



230 



240 



I 



1 



250 



TGACTTCCGTGTTAAAGCAGAAGATACTGTCACTGTGGAGAATGTTCTAA 
TGACTTCCGTGTTAAAGCAGAAGATACTGTCACTGTGGAGAATGTTCTAA 

:tgacttccgtgttaaagcagaagatactgtcactgtggagaatgttctaa 
Itgacttccgtgttaaagcagaagatactgtcactgtggagaatgttctaa 



211 
211 
211 
211 
250 



169648881 
169648885 
169648904 
169648937 
NOV17 



260 



270 



280 



290 



300 



gcagaatgaactacaatacaaggtactgataagcaacctgagaaatgtg 

?kGCAGAATGAACTACAATACA?LGGTACTGATAAGCAACCTGAGAAATGTG 
A.GCAGAATGAACTACAATACAAGGTACTGATAAGCAACCTGAGAAATGTG 
?\GCAGAATGAACTACAATACAAGGTACTGATAAGCAACCTGAGAAATGTG 
^GCAGAATGAACTACAATACAAGGTACTGATAAGCAACCTGAGAAATGTG 



261 
261 
261 
261 
300 



169648881 
169648885 
169648904 
169648937 
NOV17 



310 



320 



330 



340 



350 



GTGGAGGCTCAGTTTGATAGCCGGGTTCGTGCAACAGGACACAGTTATG 
GTGGAGGCTCAGTTTGATAGCCGGGTTCGTGCAACAGGACACAGTTATGA 
GTGGAGGCTCAGTTTGATAGCCGGGTTCGTGCAACAGGACACAGTTATGA 
GTGGAGGCTCAGTTTGATAGCCGGGTTCGTGCAACAGGACACAGTTATGA 



311 
311 
311 
311 
350 



169648881 
169648885 
169648904 
169648937 
NOV17 



360 



370 



380 



390 



400 



GAAGTACAACAAGTGGGAAACGATAGAGGCTTGGACTCAACAAGTCGCCA 
GAAGTACAACAAGTGGGAAACGATAGAGGCTTGGACTCAACAAGTCGCCA 
GAAGTACAACAAGTGGGAAACGATAGAGGCTTGGACTCAACAAGTCGCCA 
GAAGTACAACAAGTGGGAAACGATAGAGGCTTGGACTCAACAAGTCGCCA 
GAAGTACAACAAGTGGGAAACGATAGAGGCTTGGACTCAACAAGTCGCCA 



361 
361 
361 
361 
400 



169648881 
169648885 
169648904 
169648937 
NOV17 



410 



420 



430 



440 



450 



CTGAGAATCCAGCCCTCATCTCTCGCAGTGTTATCGGAACCACATTTGAG 
CTGAGAATCCAGCCCTCATCTCTCGCAGTGTTATCGGAACCACATTTGAG 
CTGAGAATCCAGCCCTCATCTCTCGCAGTGTTATCGGAACCACATTTGAG 
CTGAGAATCCAGCCCTCATCTCTCGCAGTGTTATCGGAACCACATTTGAG 
CTGAGAATCCAGCCCTCATCTCTCGCAGTGTTATCGGAACCACATTTGAG 



411 
411 
411 
411 
450 



169648881 
169648885 
169648904 
169648937 
NOV17 



460 



470 



480 



490 



500 



GGACGCGCTATTTACCTCCTGAAGGTTGGCAAAGCTGGACAAAATAAGCC 
GGACGCGCTATTTACCTCCTGAAGGTTGGCAAAGCTGGACAAAATAAGCC 
GGACGCGCTATTTACCTCCTGAAGGTTGGCAAAGCTGGACAAAATAAGCC 
GGACGCGgTATTTACCTCCTGAAGGTTGGCAAAGCTGGACAAAATAAGCC 
GGACGCGCTATTTACCTCCTGAAGGTTGGCAAAGCTGGACAAAATAAGCC 



169648881 
169648885 
169648904 
169648937 
NOV17 



510 



520 



530 



540 



550 




169648881 
169648885 
169648904 
169648937 
NOV17 



560 



570 



580 590 



600 



:CATTCTGCCAGTGGTTTGTAAGAGAGGCTGTTCGTACCTATGGACGTGAG 
CATTCTGCCAGTGGTTTGTAAGAGAGGCTGTTCGTACCgATGGACGTGAG 
CATTCTGCCAGTGGTTTGTAAGAGAGGCTGTTCGTACCTATGGACGTGAG 

cattcSgccagtggtttgtaagagaggctgttcgtacctatggacgtgag 



: ATT CgG 

:attctg 



561 
561 
561 
561 
600 



610 620 630 640 



650 
169648881 


ATCCAAGTGACAGAGCTTCTCGACAAGTTAGACTTTTATGTCCTGCCTGT 


611 


169648885 


ATCCAAGTGACAGAGCTTCTCGACAAGTTAGACTTTTATGTCCTGCCTGT 


611 


169648904 


ATCCAAGTGACAGAGCTTCTCGACAAGTTAGACTTTTATGTCCTGCCTGT 


611 


169648937 


ATCCAAGTGACAGAGCTTCTCGACAAGTTAGACTTTTATGTCCTGCCTGT 


611 


NOV17 


ATCCAAGTGACAGAGCTTCTCGACAAGTTAGACTTTTATGTCCTGCCTGT 


650 




660 670 680 690 700 

. I. ... I. J. 


169648881 


GCTCAATATTGATGGCTACATCTACACCTGGACCAAGAGCCGATTTTGGA 


661 


169648885 


GCTCAATATTGATGGCTACATCTACACCTGGACCAAGAGCCGATTTTGGA 


661 


169648904 


GCTCAATATTGATGGCTACATCTACACCTGGACCAAGAGCCGATTTTGGA 


661 


169648937 


GCTCAATATTGATGGCTACATCTACACCTGGACCAAGAGCCGATTTTGGA 


661 


NOV17 


GCTCAATATTGATGGCTACATCTACACCTGGACCAAGAGCCGATTTTGGA 


700 



710 720 730 740 750 

^^^J^U^ 693 



760 770 780 790 800 

169648881 693 

169648885 693 

169648904 693 

169648937 693 

NOV17 AAACTCGGTGAGAACaATGCTGAGTTGAATGCCCTGGCTAAAGCTACTGT 800 



810 820 830 840 850 

169648881 693 

169648885 693 

169648904 693 

169648937 693 

NOV17 GAAAGAACTTGCCTCACTGCACGGCACCAAGTACACATATGGCCCGGGAG 850 

860 870 880 890 900 

..,.|....|....|....|....|....|....|....|....|,...| 

169648881 693 

169648885 693 

169648904 693 

169648937 693 

NOV17 CTACAACAATCTATCCTGCTGCTGGGGGCTCTGACGACTGGGCTTATGAC 900 



910 920 930 940 950 

169648881 693 

169648885 693 

169648904 693 

169648937 693 

NOV17 CAAGGAATCAGATATTCCTTCACCTTTGAACTTCGAGATACAGGCAGATA 950 



960 970 980 990 1000 

.... |. ...!.... 1.. I. ...|....|....|....|....|.... I 

169648881 693 

169648885 693 

169648904 693 

169648937 693 

NOV17 TGGCTTTCTCCTTCCAGAATCCCAGATCCGGGCTACCTGCGAGGAGACCT 1000 



1010 1020 1030 1040 1050 

....|....|....|...,|....|....|....|....|....|..,.| 

169648881 693 

169648885 693 

169648904 693 

169648937 693 

NOV17 TCCTGGCAATCAAGTATGTTGCCAGCTACGTCCTGGAACACCTGTACTAG 1050 

1060 1070 



169648881 693 (SEQ ID NO:47) 

169648885 693 (SEQ ID NO:49) 

169648904 693 (SEQ ID NO:51) 

169648937 693 (SEQ ID NO: 53) 

NOV17 TTGAGAAAGCTGATGGCCTT 1070 (SEQ ID NO:45) 



Table 17E. Alignment of protein sequences for NOV17 and variants 

_ ^ _ _ 

J,... |.,,.|. .|....|... .1 

169648881 37 

169648885 37 

169648904 ^^^^^^^gS^SEgJ^^SIRSQ^^^^ffl 37 

169648937 BS8^SS^^98SSBB3^^ 37 

60 70 80 90 100 

169648881 ^^^^^^^^^^^^^^^^^j^^^^^^^^^^^ 87 

169648885 ^^^^S^^fB^^^RS^^SSS^RI^^^S^ 87 

169648904 ^S^^^^^^^^^^^^^fflS^JR^SS^ISR^^B^ 87 

169648937 ^^^^SRl^!S9^^B9^nS9^9^^ 87 



169648881 
169648885 
169648904 
169648937 
NOV17 



169648881 
169648885 
169648904 
169648937 
NOV17 



110 



120 



13 0 



140 



150 



EAQFDSRVRATGHSYEKYNKWETISAWTQQVATENPALISRSVIGTTFE 
VEAQFDSRVRATGHSYSKYNKWSTIEAWTQQVATENPALISRSVIGTTFE 
7EAQFDSRVRATGHSYEKYNKWETISAWTQQVATENPALISRSVIGTTFE 
VEAQFBSRVRATGHSYEKYNKWETISAWTQQVATSNPALISRSVIGTTFE 
VEAQFDSRVRATGKSYEKYNKWSTIEAWTQQVATENPALISRSVIGTTFE 



180 




190 



200 



ISPAFCQ 
ISPAFCQ 
ISPAFCQ 
ISPAFi^Q 
ISPAFCQ 




210 220 230 240 250 

J..^ .... I 

169648881 ^^^^^^^^^^A^g^^^^gagg^^!^^ 231 

169648885 ^ 231 

169648904 ^^^M^S^SM^^^^^SSjS8^|9MB 231 

169648937 m|^^^^™ffl^S^|^^KW^HSHSCT 231 

260 270 280 290 300 

....|....|....|....|....|....|....|....|....|....| 

169648881 231 

169648885 231 

169648904 231 

169648937 231 

NOV17 KLGENNAELNAUUKATVKEIiASLHGTKYTYGPGATTIYPAAGGSDDWAyD 300 

310 320 330 340 350 
.... |.... |.. .. |. ... I-,.. |....|....|,. ..!....[.... I 
169648881 231 

169648885 231 

169648904 231 
169648937 

NOV17 QGIRYSFTFELRiyrGRYGFIiLPESQIRATCEETFLAIKYVASYVLEHLYr. 350 

....| 

169648881 231 (SEQ ID NO:48) 

169648885 231 (SEQ ID NO:50) 

169648904 231 (SEQ ID NO:52) 

169648937 231 (SEQ ID NO:54) 

NOV17 RKLMA 355 (SEQ ID NO:46) 



NOV17 also has homology to the amino acid sequences shown in the BLASTP data 
listed in Table 17F. 



Table 17F. BLAST results for NOV17 

Gene Index/Identifier                           Protein/Organism                                                                                    Length (aa)  Identity (%)   Positives (%)  Expect
gi|4503003|ref|NP_001862.1| (NM_001871)         pancreatic carboxypeptidase B1 precursor; pancreas-specific protein [Homo sapiens]                   416          292/416 (70%)  303/416 (72%)  e-150
gi|15929839|gb|AAH15338.1|AAH15338 (BC015338)   unknown (protein for MGC:21282) [Homo sapiens]                                                       417          291/417 (69%)  303/417 (71%)  e-150
gi|3915628|sp|P15086|CBPB_HUMAN                 CARBOXYPEPTIDASE B PRECURSOR (PANCREAS-SPECIFIC PROTEIN) (PASP)                                      417          290/417 (69%)  303/417 (72%)  e-150
gi|5457422|emb|CAB46991.1| (AJ133775)           procarboxypeptidase B [Sus scrofa]                                                                   416          239/416 (57%)  272/416 (64%)  e-122
gi|1705666|sp|P55261|CBPB_CANFA                 Carboxypeptidase B precursor (47 kDa zymogen granule membrane associated protein) (ZAP47)            416          237/416 (56%)  272/416 (64%)  e-122



The homology of these sequences is shown graphically in the ClustalW analysis 
shown in Table 17G. 



Table 17G. ClustalW Analysis of NOV17 

1) NOV17 (SEQ ID NO:46) 

2) gi|4503003 (SEQ ID NO:266) 

3) gi|15929839 (SEQ ID NO:267) 

4) gi|3915628 (SEQ ID NO:268) 

5) gi|5457422 (SEQ ID NO:269) 

6) gi|1705666 (SEQ ID NO:270) 

[ClustalW alignment blocks not recoverable from the scanned source.] 


Table 17H lists the domain description from DOMAIN analysis results against 
NOV17. This indicates that the NOV17 sequence has properties similar to those of other 
proteins known to contain these domains. 

Table 17H. Domain Analysis of NOV17 

HMM file: pfamHMMs 

Scores for sequence family classification (score includes all domains): 
Model                    Description                     Score   E-value   N
Zn_carbOpept (InterPro)  Zinc carboxypeptidase           357.0   2e-103    2
Propep_M14 (InterPro)    Carboxypeptid activation pept   138.1   1.6e-37   1

Parsed for domains: 
Model         Domain  seq from  seq to     hmm from  hmm to     score   E-value
Propep_M14    1/1           26     105 ..         1      82 []  138.1   1.6e-37
Zn_carbOpept  1/2          119     236 ..         1     125 [.  206.6   3.8e-58
Zn_carbOpept  2/2          242     332 ..       204     304 .]  149.5   6e-41


Alignments of top-scoring domains: 

Propep_M14: domain 1 of 1, from 26 to 105: score 138.1, E = 1.6e-37 
* - >qVlrvkvadedC2vkllkdLentehleLDFWkpdsatpikpgstvDf r 
NOV17 26 KVFRVNVEDENHINIIREIATFI--QIDFWKPDSVTQIKPHSTVDFR 70 

VpaediqavksfLeqsglhYevlIeDVqelLeeqf <-* (SEQ ID NO: 271) 
NOV17 71 VKAEDTVTVENVLKQNEIK}YKVLISNLRKVVEAQF 105 (SEQ ID N0:272) 



Zn_carbOpept: domain 1 of 2, from 119 to 236: score 206.6, E = 3.8e-58

* - >YhnleeiyawlDllvsnf PdLvskvsiGksyeGRdlkvLKisdnpat 
I +++ 1 + 1 + 1 I ++++++++ 1 I + 1 +++ 1 I +++ 1 I I +++ I I ++^ 
NOV17 119 YNKWETIEAWTQQVATENPALISRSVIGTTFEGRAIYLUCVGKA 162 

genePevf avagWiHAREwvtsAt 1 1 wllkel vanYgsDkt i tklldgld 

l + + l + l 11 + 11 

NOV17 163 GQNKPAIFMECG-FHAREWISPAPCQWFVREAVRTYGREIQVTELLDKLD 211 
lfyilpvfl!^D<^aYsittdSyRmWRKt<-* (SEQ ID NO: 273) 

IIHII hlll + k+h I III! 

NOV17 212 -FYVI,PVI.NIDGYIYTWTKS--RFWRKT 236 (SEQ ID NO:274) 



Zn_carbOpept: domain 2 of 2, from 242 to 332: score 149.5, E = 6e-41

             *->llyPYgydynlnpdandldelsdlkiaadalsarhgtyYtlglpgss
                ++tll + l l + l +++ +I++I+ 1+ ++ |+++|||4.|| + | II++
   NOV17  242   WIYPYSYAYKI^ENNAEiaJAIA--KATVKEIJ^IfiGTKYTYG-P6AT 285

                1 1 YpasAGGsdDwaydvgi ikyaf t f ElrpdtgsyGnPCFl iPeeql ip t
                mii^ iiiiiiiii+i I i+iiiMi iii+ii IIIII+M+ I
   NOV17  286   TIYPAA-GGSDDWAYDQG-IRYSFTFELR-DT6RYG FLLPESQIRAT 329

                gsee<-* (SEQ ID NO:275)
                I
   NOV17  330   CE-E 332 (SEQ ID NO:276)






The carboxypeptidase A family (M14) can be divided into two subfamilies:
carboxypeptidase H (regulatory) and carboxypeptidase A (digestive). Members of the H
family have longer C-termini than those of family A, and carboxypeptidase M (a member of
the H family) is bound to the membrane by a glycosylphosphatidylinositol anchor, unlike the
majority of the M14 family, which are soluble. See InterPro IPR000834.

The zinc ligands have been determined as two histidines and a glutamate, and the 
catalytic residue has been identified as a C-terminal glutamate, but these do not form the 
characteristic metalloprotease HEXXH motif. Members of the carboxypeptidase A family 
are synthesised as inactive molecules with propeptides that must be cleaved to activate the
enzyme. Structural studies of carboxypeptidases A and B reveal the propeptide to exist as a
globular domain, followed by an extended alpha-helix; this shields the catalytic site, without
specifically binding to it, while the substrate-binding site is blocked by making specific 
contacts. 

The domain information indicates that the NOV17 sequence of the invention has
properties similar to those of other proteins known to contain this/these domain(s) and similar 
to the properties of these domains. 

A human pancreas-specific protein (PASP), previously characterized as a serum 
marker for acute pancreatitis and pancreatic graft rejection, has been identified as pancreatic 
procarboxypeptidase B (PCPB). cDNAs encoding PASP/PCPB were isolated from a human 
pancreas cDNA library using a combination of nucleic acid hybridization screening and 
immunoscreening with antisera raised against native PASP. The deduced amino acid 
sequence of PASP/PCPB cDNA predicts the translation of a 416-amino acid preproenzyme 
with a 15-amino acid signal/leader peptide and a 95-amino acid activation peptide. The
proenzyme portion of this protein has 76% identity with rat PCPB and 84% identity with 
bovine carboxypeptidase B. DNA and RNA blot analyses indicate that human PCPB mRNA 
(1,400 nucleotides) is transcribed from a single locus in the human genome in a tissue- 
specific fashion. N-terminal sequencing of native PASP and the specific immunoreactivity of 
bacterially expressed PASP/PCPB with native PASP antibodies confirm the identification of 
PASP as human pancreatic PCPB. PMID: 1370825 

In contrast to procarboxypeptidase B, which has always been reported to be secreted
by the pancreas as a monomer, procarboxypeptidase A occurs as a monomer and/or
associated with one or two functionally different proteins, depending on the species. Recent
studies showed that, in the human pancreatic secretion, procarboxypeptidase A is mainly
secreted as a 44 kDa protein involved in at least three different binary complexes. As
previously reported, two of these complexes associated procarboxypeptidase A to either a
glycosylated truncated protease E or zymogen E. In this paper, we identified proelastase 2 as
the partner of procarboxypeptidase A in the third complex, thus reporting for the first time the
occurrence of a proelastase 2/procarboxypeptidase A binary complex in vertebrates.
Moreover, from N-terminal sequence analyses, the 44 kDa procarboxypeptidase A involved
in these complexes was identified as being of the A1 type. Only one type of
procarboxypeptidase B, the B1 type, has been detected in the analyzed pancreatic juices, thus
emphasizing the previously observed genetic differences between individuals. PMID:
2307232

Carboxypeptidase B1 is a highly tissue-specific protein and is a useful serum marker
for acute pancreatitis and dysfunction of pancreatic transplants. It is not elevated in pancreatic
carcinoma. The protein, referred to as pancreas-specific protein (PASP) by Yamamoto et al.
(1992), has a molecular mass of 44,500 Da and constitutes about 2% of total pancreatic
cytosolic proteins. A computer search of protein sequence data using the first 25 amino acids
from the N-terminal end suggested that PASP is pancreatic procarboxypeptidase B.
Yamamoto et al. (1992) isolated a cDNA for PASP/PCPB and demonstrated that the deduced
amino acid sequence represented a 416-amino acid preproenzyme with a 15-amino acid 
signal/leader peptide and a 95-amino acid activation peptide. RNA blot analyses indicated 
that the human PCPB mRNA, with 1,400 nucleotides, is transcribed from a single locus in the 
human genome in a tissue-specific fashion. See Yamamoto, et al, J. Biol. Chem. 267: 2575- 
2581, 1992. PubMed ID : 1370825. 

The protein similarity information, expression pattern, cellular localization, and map 
location for the NOV17 protein and nucleic acid disclosed herein suggest that this 
Carboxypeptidase B, Pancreatic-like protein may have important structural and/or 
physiological functions characteristic of the Carboxypeptidase B, Pancreatic family. 
Therefore, the nucleic acids and proteins of the invention are useful in potential diagnostic 
and therapeutic applications and as a research tool. These include serving as a specific or
selective nucleic acid or protein diagnostic and/or prognostic marker, wherein the presence or 
amount of the nucleic acid or the protein are to be assessed. These also include potential 
therapeutic applications such as the following: (i) a protein therapeutic, (ii) a small molecule 
drug target, (iii) an antibody target (therapeutic, diagnostic, drug targeting/cytotoxic 
antibody), (iv) a nucleic acid useful in gene therapy (gene delivery/gene ablation), (v) an 
agent promoting tissue regeneration in vitro and in vivo, and (vi) a biological defense 
weapon.

The NOV17 nucleic acids and proteins of the invention have applications in the
diagnosis and/or treatment of various diseases and disorders. For example, the compositions
of the present invention will have efficacy for the treatment of patients suffering from:
diabetes, Von Hippel-Lindau (VHL) syndrome, pancreatitis, obesity, ulcers, digestive
disorders as well as other diseases, disorders and conditions.

These materials are further useful in the generation of antibodies that bind
immunospecifically to the novel substances of the invention for use in therapeutic or 
diagnostic methods. These antibodies may be generated according to methods known in the 
art, using prediction from hydrophobicity charts, as described in the "Anti-NOVX 
Antibodies" section below. The disclosed NOV17 protein has multiple hydrophilic regions,
each of which can be used as an immunogen. In one embodiment, a contemplated NOV17
epitope is from about amino acids 25 to 45. In another embodiment, a contemplated NOV17
epitope is from about amino acids 60 to 80. In other specific embodiments, contemplated
NOV17 epitopes are from about amino acids 80 to 85, 110 to 130, 160 to 162, 170 to 172,
180 to 202, 240 to 260, 265 to 268, 290 to 305 and 310 to 320.

NOV18 

One NOVX protein of the invention, referred to herein as NOV18, includes two
Ribosomal Protein L29-like proteins. The disclosed proteins have been named NOV18a and
NOV18b.

NOV18a

A disclosed NOV18a (designated CuraGen Acc. No. CG57113-01), which encodes a
novel Ribosomal Protein L29-like protein and includes the 649 nucleotide sequence (SEQ ID 
NO:55) is shown in Table 18A. An open reading frame for the mature protein was identified 
beginning with an ATG initiation codon at nucleotides 43-45 and ending with a TAG stop 
codon at nucleotides 526-528. Putative untranslated regions are underlined in Table 18A, 
and the start and stop codons are in bold letters. 
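
As a quick consistency check on the coordinates above, the length of the encoded polypeptide follows directly from the open reading frame boundaries. The Python sketch below is illustrative only: the function name is hypothetical and not part of the disclosure, and it assumes 1-based, inclusive nucleotide coordinates with the stop codon included in the range.

```python
def orf_protein_length(start: int, stop_end: int) -> int:
    """Return the number of residues encoded by an open reading frame.

    start    -- 1-based position of the first base of the initiation codon
    stop_end -- 1-based position of the last base of the stop codon
    (Both arguments are assumptions about how the coordinates are quoted.)
    """
    span = stop_end - start + 1          # nucleotides, stop codon included
    if span <= 0 or span % 3 != 0:
        raise ValueError("ORF span must be a positive multiple of three")
    return span // 3 - 1                 # the stop codon encodes no residue


# NOV18a: ATG at nucleotides 43-45, TAG at nucleotides 526-528
print(orf_protein_length(43, 528))
```

For the NOV18a coordinates the span is 486 nucleotides, or 162 codons including the stop codon, which corresponds to 161 residues and agrees with the length given for the NOV18a polypeptide.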

Table 18A. NOV18a Nucleotide Sequence (SEQ ID NO:55)

ACTCACTATAGGGCTCGAGCGGCCGCCCGGGCAGGTGCAGACATGGCCAAGTCCAAGAACCACACCAC
ACACAACCAGTCCCGAAAATGGCACAGAAATGGTATCAAGAAACCCCGATCACAAAGATACGAATCTC
TTAAGGGGGTGGACCCCAAGTTCCTGAGGAACATGCGCTTTGCCAAGAAGCACAACAAAAAGGGCCTA
AAGAAGATGCAGGCCAACAATGCCAAGGCCATGAGTGCACGTGCCGAGGCTATCAAGGCCCTCGTAAA
GCCCAAGGAGGTTAAGCCCAAGATCCCAAAGGGTGTCAGCCGCAAGCTCGATCGACTTGCCTACATTG
CCCACCCCAAGCTTGGGAAGCGTGCTCGTGCCCGTATTGCCAAGGGGCTCAGGCTGTGCCGGCCAAAG
GCCAAGGCCAAGGCCAAGGCCAAGGCCAAGGATCAAACCAAGGCCCAGGCTGCAGCCCCAGCTTCAGT
TCCAGCTCAGGCTCCCAAACGTACCCAGGCCCCTACAAAGGCTTCAGAGTAGATATCTCTGCCAACAT
GAGGACAGAAGGACTGGTGCGACCCCCCACCCCCGCCCCTGGGCTACCATCTGCATGGGGCTGGGGTC
CTCCTGTGCTACTGGTACAAATAAACCTGAGGCAGGA

The disclosed NOV18a nucleic acid sequence maps to chromosome 3q29-qter and has
620 of 630 bases (98%) identical to a gb:GENBANK-ID:HSU10248|acc:U10248.1 mRNA
from Homo sapiens (Human ribosomal protein L29 (humrpl29) mRNA, complete cds) (E =
4.7e^^^). 
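
The identity figure quoted above can be reproduced from the match counts. The sketch below is a minimal illustration with a hypothetical helper name. It assumes the percentage is truncated to a whole number rather than rounded; that assumption is made here because it is consistent with the counts reported in this section (for example 548 of 555 bases given as 98%), not because the disclosure states it.

```python
def percent_identity(matches: int, aligned: int) -> int:
    """Whole-number percent identity, truncated (assumed convention)."""
    if aligned <= 0 or not 0 <= matches <= aligned:
        raise ValueError("need 0 <= matches <= aligned and aligned > 0")
    return (100 * matches) // aligned    # integer division truncates


# NOV18a versus GenBank U10248.1: 620 of 630 bases
print(percent_identity(620, 630))
```

With rounding instead of truncation, 548 of 555 would be reported as 99%, so the choice of convention matters when comparing figures across tables.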

A disclosed NOV18a polypeptide (SEQ ID NO:56) is 161 amino acid residues in
length and is presented using the one-letter amino acid code in Table 18B. The SignalP,
Psort and/or Hydropathy results predict that NOV18a does not have a signal peptide and is
likely to be localized to the nucleus with a certainty of 0.9840. In alternative embodiments, a
NOV18a polypeptide is located to the mitochondrial matrix space with a certainty of 0.1000
or the lysosome (lumen) with a certainty of 0.1000. 

Table 18B. Encoded NOV18a Protein Sequence (SEQ ID NO:56)

MAKSKNHTTHNQSRKWHRNGIKKPRSQRYESLKGVDPKFLRNMRFAKKHNKKGLKKMQANNAKAMSARAEAIKALVK
PKEVKPKIPKGVSRKLDRLAYIAHPKLGKRARARIAKGLRLCRPKAKAKAKAKAKDQTKAQAAAPASVPAQAPKRTQ
APTKASE
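
The relationship between the nucleotide sequence of Table 18A and the protein sequence of Table 18B can be illustrated by translating the start of the open reading frame. This is a sketch only: the function and variable names are hypothetical, the standard genetic code is assumed, and the input is the first 33 nucleotides of the open reading frame as read from Table 18A.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate DNA in frame from its first base, stopping at a stop codon."""
    protein = []
    for pos in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[pos:pos + 3].upper()]
        if residue == "*":               # stop codon ends translation
            break
        protein.append(residue)
    return "".join(protein)


# First 11 codons of the NOV18a open reading frame (nucleotides 43-75)
orf_start = "ATGGCCAAGTCCAAGAACCACACCACACACAAC"
print(translate(orf_start))  # MAKSKNHTTHN
```

The output matches the first eleven residues of the sequence in Table 18B.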

The NOV18a amino acid sequence was found to have 159 of 161 amino acid residues
(98%) identical to, and 159 of 161 amino acid residues (98%) similar to, the 159 amino acid
residue ptnr:pir-id:S65784 protein from human (ribosomal protein L29, cytosolic) (E = 
2.5e^% 

NOV18a is expressed in at least the following tissues: adrenal gland, bone marrow,
brain - amygdala, brain - cerebellum, brain - hippocampus, brain - substantia nigra, brain -
thalamus, brain - whole, fetal brain, fetal kidney, fetal liver, fetal lung, heart, kidney,
lymphoma - Raji, mammary gland, pancreas, pituitary gland, placenta, prostate, salivary
gland, skeletal muscle, small intestine, spinal cord, spleen, stomach, testis, thyroid, trachea
and uterus, as well as Adipose, Amnion, Aorta, Appendix, Artery, Ascending Colon, Bone, Bronchus,
Brown adipose, Buccal mucosa, Cartilage, Cerebral Medulla/Cerebral white matter, Cervix,
Chorionic Villus, Colon, Coronary Artery, Dermis, Epidermis, Foreskin, Frontal Lobe, Gall
Bladder, Gastro-intestinal/Digestive System, Hair Follicles, Hypothalamus, Kidney Cortex,
Larynx, Left cerebellum, Liver, Lung, Lung Pleura, Lymph node, Lymphoid tissue, Muscle,
Ovary, Oviduct/Uterine Tube/Fallopian tube, Parathyroid Gland, Parietal Lobe, Parotid
Salivary glands, Peripheral Blood, Pineal Gland, Pituitary Gland, Respiratory Bronchiole,
Retina, Right Cerebellum, Skin, Spongy Bone/Cancellous bone, Synovium/Synovial
membrane, Temporal Lobe, Thymus, Tonsils, Umbilical Vein, Urinary Bladder, Vein, Vulva,
White adipose, and Whole Organism. Expression information was derived from the tissue
sources of the sequences that were included in the derivation of the sequence of NOV18a.

NOV18b

A disclosed NOV18b (designated CuraGen Acc. No. CG57113-02) includes
the 580 nucleotide sequence (SEQ ID NO:57) shown in Table 18C. An open reading frame
for the mature protein was identified beginning with an ATG codon at nucleotides 54-56 and
ending with a TAG codon at nucleotides 537-539. The start and stop codons of the open
reading frame are highlighted in bold type. Putative untranslated regions are underlined.



Table 18C. NOV18b Nucleotide Sequence (SEQ ID NO:57)

ACTCAOTATAQGGCTCGAGCGGCQOTTCGQGAGCCGCGGCTTATGQTGCi^
CCACACACAACCAGTCCCGAAAATGGCACAGAAATGGTATCAAGAAACCCCGATmCAAAGAT^
AGGGGGTGGACCCCAAGTTCCTGAGGAACATGCGCTTTGCCAAGAAGCACT^C^^
AGGCCAACAATGCCAAGGCCATGAGTGCACGTGCCGAGGCTATCAAGGCCCTCGTAAAG^
CCAAGATCCCAAAGGGTGTCAGCCGCAAGCTCGATCGACTTGCCTACTiTTC
CTCGTGCCCGTATTGCCAAGGGGCTCAGGCTGTGCCGGCCAAAGGCCAAGGCC^
ATCAAACCAAGGCCCAGGCTGCAGCCCCAGCTTCAGTTCCAGCTCAGGCTCCCA^
AGGCTTCAGAGTAGATATCTCTGCCAACATGAGGACAGAAAGACTGGTGCGACCC



The disclosed NOV18b nucleic acid sequence maps to chromosome 3q29-qter and
has 548 of 555 bases (98%) identical to a gb:GENBANK-ID:HSU10248|acc:U10248.1 
mRNA from Homo sapiens (Human ribosomal protein L29 (humrpl29) mRNA, complete 
cds)(E= 1.2e-^^^). 

The NOV18b polypeptide (SEQ ID NO:58) is 161 amino acid residues in length and
is presented using the one-letter amino acid code in Table 18D. The SignalP, Psort and/or
Hydropathy results predict that NOV18b has a signal peptide and is likely to be localized to
the nucleus with a certainty of 0.9840. In alternative embodiments, a NOV18b polypeptide is
located to the mitochondrial matrix space with a certainty of 0.1000 or the lysosome (lumen) 
with a certainty of 0.4600. 

Table 18D. Encoded NOV18b Protein Sequence (SEQ ID NO:58)

MAKSKNHTTHNQSRKWHRNGIKKPRSQRYESLKGVDPKFLRNMRFAKKHNKKGLKKMQANNAKAMSARAEAIKALV
KPKEVKPKIPKGVSRKLDRLAYIAHPKLGKRARARIAKGLRLCRPKAKAKAKAKAKDQTKAQAAAPASVPAQAPKR
TQAPTKASE

The NOV18b amino acid sequence was found to have 159 of 161 amino acid residues
(98%) identical to, and 159 of 161 amino acid residues (98%) similar to, the 159 amino acid
residue ptnr:pir-id:S65784 protein from human (ribosomal protein L29, cytosolic) (E =
2.7e-'').

NOV18b is expressed in at least the following tissues: adrenal gland, bone marrow,
brain - amygdala, brain - cerebellum, brain - hippocampus, brain - substantia nigra, brain -
thalamus, brain - whole, fetal brain, fetal kidney, fetal liver, fetal lung, heart, kidney,
lymphoma - Raji, mammary gland, pancreas, pituitary gland, placenta, prostate, salivary
gland, skeletal muscle, small intestine, spinal cord, spleen, stomach, testis, thyroid, trachea
and uterus, as well as Adipose, Amnion, Aorta, Appendix, Artery, Ascending Colon, Bone, Bronchus,
Brown adipose, Buccal mucosa, Cartilage, Cerebral Medulla/Cerebral white matter, Cervix,
Chorionic Villus, Colon, Coronary Artery, Dermis, Epidermis, Foreskin, Frontal Lobe, Gall
Bladder, Gastro-intestinal/Digestive System, Hair Follicles, Hypothalamus, Kidney Cortex,
Larynx, Left cerebellum, Liver, Lung, Lung Pleura, Lymph node, Lymphoid tissue, Muscle,
Ovary, Oviduct/Uterine Tube/Fallopian tube, Parathyroid Gland, Parietal Lobe, Parotid
Salivary glands, Peripheral Blood, Pineal Gland, Pituitary Gland, Respiratory Bronchiole,
Retina, Right Cerebellum, Skin, Spongy Bone/Cancellous bone, Synovium/Synovial
membrane, Temporal Lobe, Thymus, Tonsils, Umbilical Vein, Urinary Bladder, Vein, Vulva,
White adipose, and Whole Organism. Expression information was derived from the tissue
sources of the sequences that were included in the derivation of the sequence of NOV18b.

The sequence is predicted to be expressed in heart because of the expression pattern 
of (GENBANK-ID: gb:GENBANK-ID:HSU10248|acc:U10248.1), a closely related Human
ribosomal protein L29 (humrpl29) mRNA, complete cds homolog in species Homo sapiens. 

The nucleic acids for NOV18a and NOV18b are very closely homologous, as is shown
in the alignment in Table 18E. The disclosed NOV18a and NOV18b proteins are identical.




Table 18E. Alignment of DNA sequences for NOV18a and NOV18b

Positions 1-150: alignment rows not legible in the source.

Positions 151-200
CG57113-01 NOV18a  GGGGGTGGACCCCAAGTTCCTGAGGAACATGCGCTTTGCCAAGAAGCAC
CG57113-02 NOV18b  GGGGGTGGACCCCAAGTTCCTGAGGAACATGCGCTTTGCCAAGAAGCAC

Positions 201-250
CG57113-01 NOV18a  AACAAAAAGGGCCTAAAGAAGATGCAGGCCAACAATGCCAAGGCCATGAG
CG57113-02 NOV18b  AACAAAAAGGGCCTAAAGAAGATGCAGGCCAACAATGCCAAGGCCATGAG

Positions 251-300
CG57113-01 NOV18a  TGCACGTGCCGAGGCTATCAAGGCCCTCGTAAAGCCCAAGGAGGTTAAGC
CG57113-02 NOV18b  TGCACGTGCCGAGGCTATCAAGGCCCTCGTAAAGCCCAAGGAGGTTAAGC

Positions 301-350
CG57113-01 NOV18a  CCAAGATCCCAAAGGGTGTCAGCCGCAAGCTCGATCGACTTGCCTACATT
CG57113-02 NOV18b  CCAAGATCCCAAAGGGTGTCAGCCGCAAGCTCGATCGACTTGCCTACATT

Positions 351-400
CG57113-01 NOV18a  GCCCACCCCAAGCTTGGGAAGCGTGCTCGTGCCCGTATTGCCAAGGGGCT
CG57113-02 NOV18b  GCCCACCCCAAGCTTGGGAAGCGTGCTCGTGCCCGTATTGCCAAGGGGCT

Positions 401-450
CG57113-01 NOV18a  CAGGCTGTGCCGGCCAAAGGCCAAGGCCAAGGCCAAGGCCAAGGCCAAGG
CG57113-02 NOV18b  CAGGCTGTGCCGGCCAAAGGCCAAGGCCAAGGCCAi^GCCAAGGCCAAGG

Positions 451-500
CG57113-01 NOV18a  ATCAAACCAAGGCCCAGGCTGCAGCCCCAGCTTCAGTTCCAGCTCAGGCT
CG57113-02 NOV18b  ATCAAACCAAGGCCCAGGCTGCAGCCCCAGCTTCAGTTCCAGCTCAGGCT

Positions 501-550
CG57113-01 NOV18a  CCCAAACGTACCCAGGCCCCTACAAAGGCTTCAGAGTAGATATCTCTGCC
CG57113-02 NOV18b  CCCAAACGTACCCAGGCCCCTACAAAGGCTTCAGAGTAGATATCTCTGCC

Positions 551-600: alignment rows not legible in the source.

Positions 601-650
CG57113-01 NOV18a  ACCATCTGCATGGGGCTGGGGTCCTCCTGTGCTACTGGTACAAATAAACC
CG57113-02 NOV18b

Positions 651-660
CG57113-01 NOV18a  TGAGGCAGGA
CG57113-02 NOV18b


Homologies to any of the above NOV18 proteins will be shared by the other NOV18
proteins insofar as they are homologous to each other as shown above. Any reference to
NOV18 is assumed to refer to both of the NOV18 proteins in general, unless otherwise noted.

NOV18 also has homology to the amino acid sequences shown in the BLASTP data
listed in Table 18F. 



Table 18F. BLAST results for NOV18

Gene Index/Identifier         Protein/Organism                          Length  Identity       Positives      Expect
                                                                        (aa)    (%)            (%)

gi|4506629|ref|NP_000983.1|   ribosomal protein L29; 60S ribosomal      159     159/161 (98%)  159/161 (98%)  2e-39
(NM_000992)                   protein L29; heparin/heparan
                              sulfate-interacting protein;
                              HP/HS-interacting protein;
                              heparin/heparan sulfate-binding
                              protein; cell surface heparin-binding
                              protein HIP [Homo sapiens]

gi|13642818|ref|XP_018182.1|  hypothetical protein XP_018182            157     152/161 (94%)  153/161 (94%)  2e-38
(XM_018182)                   [Homo sapiens]

gi|13648543|ref|XP_017364.1|  hypothetical protein XP_017364            155     151/161 (93%)  151/161 (93%)  4e-38
(XM_017364)                   [Homo sapiens]

gi|1082766|pir||S54204        ribosomal protein L29 - human             159     157/161 (97%)  157/161 (97%)  6e-37

gi|17456336|ref|XP_063630.1|  similar to ribosomal protein L29;         189     128/158 (81%)  138/158 (87%)  7e-37
(XM_063630)                   heparin/heparan sulfate interacting
                              protein (H. sapiens) [Homo sapiens]



The homology of these sequences is shown graphically in the ClustalW analysis 
shown in Table 18G. 



Table 18G. ClustalW Analysis of NOV18

1) NOV18a (SEQ ID NO:56)
2) NOV18b (SEQ ID NO:58)
3) gi|4506629 (SEQ ID NO:277)
4) gi|13642818 (SEQ ID NO:278)
5) gi|13648543 (SEQ ID NO:279)
6) gi|1082766 (SEQ ID NO:280)
7) gi|17456336 (SEQ ID NO:281)

[The aligned sequence rows are not legible in the source. The final alignment block ends at
residue 161 for NOV18a and NOV18b, 159 for gi|4506629, 157 for gi|13642818, 155 for
gi|13648543, 159 for gi|1082766 and 189 for gi|17456336. The only legible fragments are
FRVBISVCQREDRRTGATPPG (ending at position 174) and CHRHGAGVLLCyLYK (positions 175 to
189), both apparently from gi|17456336.]



Table 18H lists the domain description from DOMAIN analysis results against 
NOV18. This indicates that the NOV18 sequence has properties similar to those of other
proteins known to contain these domains. 





Table 18H. Domain Analysis of NOV18

gnl|Pfam|pfam01779, Ribosomal_L29e, Ribosomal L29e protein family.
CD-Length = 40 residues, 100.0% aligned
Score = 48.1 bits (113), Expect = 4e-07

NOV18: 3  KSKNHTTHNQSRKWHRNGIKKPRSQRYESLKGVDPKFLRN 42  (SEQ ID NO:282)
Sbjct: 1  KSKNHTNHNQNKKAHRNGIKKPQKKRYLSLKGVDAKFRRN 40  (SEQ ID NO:283)



Ribosomal protein L29e forms part of the 60S ribosomal subunit. This family is
found in eukaryotes. There are 20 to 22 copies of the L29 gene in rat. Rat L29 is
related to yeast ribosomal protein YL43. See InterPro IPR002673. Human ribosomal protein
L29 has been shown to have the same nucleotide sequence as that of cell surface
heparin/heparan sulfate-binding protein (Genomics 1997 Nov 15;46(1):148-51). Heparan
sulfate proteoglycans and their corresponding binding sites have been suggested to play an
important role during the initial attachment of murine blastocysts to uterine epithelium and
human trophoblastic cell lines to uterine epithelial cell lines (J Biol Chem 1996 May
17;271(20):11817-23). Heparin/heparan sulfate interacting protein (HIP) has been shown to
be up-regulated in colorectal carcinoma. HIP is a candidate marker of abnormal cell growth
in the colon and a prognostic marker for colorectal carcinoma (Cancer Res 1999 Jun
15;59(12):2989-94). Therefore, this novel ribosomal protein L29-like protein
may play roles in blastocyst attachment and in tumorigenesis.

The protein synthesis reactions require a complex catalytic machinery to guide them.
The growing end of the polypeptide chain, for example, must be kept in register with the
mRNA molecule to ensure that each successive codon in the mRNA engages precisely with
the anticodon of a tRNA molecule and does not slip by one nucleotide, thereby changing the
reading frame. This precise movement and the other events in protein synthesis are catalyzed 
by ribosomes, which are large complexes of RNA and protein molecules. Eucaryotic and 
procaryotic ribosomes are very similar in design and function. Both are composed of one 
large and one small subunit that fit together to form a complex with a mass of several million 
daltons. The small subunit binds the mRNA and tRNAs, while the large subunit catalyzes 
peptide bond formation. More than half of the weight of a ribosome is RNA, and there is 
increasing evidence that the ribosomal RNA (rRNA) molecules play a central part in its 
catalytic activities. Ribosomes contain a large number of proteins, but many of these have 
been relatively poorly conserved in sequence during evolution. 

During the large scale partial sequencing of human heart cDNA clones, a novel clone 
which is very similar to the rat ribosomal protein L29 in both DNA and amino acid sequences 
has been found. The cDNA encodes a protein with a deduced molecular weight of 17751 
(159 aa). It shows 80.4% homology to protein L29 from the large ribosomal subunit of rat 
and is related to yeast YL43. The putative protein has been named human ribosomal protein 
L29 (hRPL29). hRPL29 has a large excess of basic residues over acidic ones. The large 
amount of charged residues makes the protein very hydrophilic and the protein has a deduced 
pI of 12.16. Internal repeats have been characterized in many ribosomal proteins and a
tandem repeat of KAKAKAKA (SEQ ID NO:284) was found to be unique to hRPL29. 
Northern analysis indicated that the mRNA that encodes human L29 is approx. 800 base pairs 
in length. An intron of hrpL29 has also been cloned and sequenced by polymerase chain 
reaction using human genomic DNA as the template. 

By somatic cell hybrid analysis, radiation hybrid mapping, and fluorescence in situ 
hybridization, hRPL29 has been located on the telomeric region of the q arm of chromosome 
3. hRPL29 is the most distal marker of the long arm of chromosome 3. Of the human 
ribosomal protein genes mapped, hRPL29 is the shortest distance from another ribosomal 
protein gene marker, hRPL35a, which has also been mapped to the 3q29-qter region. The
human ribosomal protein L29 has been subsequently shown to have the same nucleotide
sequence as that of cell surface heparin/heparan sulfate-binding protein, designated HP/HS
interacting protein (HIP). Transfection of HIP full-length cDNA into NIH-3T3 cells
demonstrates cell surface expression and a size similar to that of HIP expressed by human
cells. Predicted amino acid sequence indicates that HIP lacks a membrane spanning region
and has no consensus sites for glycosylation. Northern blot analysis detects a single transcript
of 1.3 kilobases in both total RNA and poly(A+) RNA. Examination of human cell lines and
normal tissues using both Northern blot and Western blot analyses reveals that HIP is 
expressed at different levels in a variety of human cell lines and normal tissues but absent in 
some cell lines and some cell types of normal tissues examined. Thus, members of the L29 
family may be displayed on cell surfaces where they may participate in HP/HS binding 
events. Heparan sulfate proteoglycans and their corresponding binding sites have been 
suggested to play an important role during the initial attachment of murine blastocysts to 
uterine epithelium and human trophoblastic cell lines to uterine epithelial cell lines. 

The protein similarity information, expression pattern, cellular localization, and map 
location for the protein and nucleic acid disclosed herein suggest that this ribosomal protein 
L29-like protein may have important structural and/or physiological functions characteristic 
of the ribosomal L29e proteins family. Therefore, the nucleic acids and proteins of the 
invention are useful in potential diagnostic and therapeutic applications and as a research 
tool. These include serving as a specific or selective nucleic acid or protein diagnostic and/or 
prognostic marker, wherein the presence or amount of the nucleic acid or the protein are to be 
assessed. These also include potential therapeutic applications such as the following: (i) a 
protein therapeutic, (ii) a small molecule drug target, (iii) an antibody target (therapeutic, 
diagnostic, drug targeting/cytotoxic antibody), (iv) a nucleic acid useful in gene therapy (gene
delivery/gene ablation), (v) an agent promoting tissue regeneration in vitro and in vivo, and 
(vi) a biological defense weapon. 

The nucleic acids and proteins of the invention have applications in the diagnosis 
and/or treatment of various diseases and disorders. For example, the compositions of the 
present invention may have efficacy for the treatment of patients suffering from cancer, 
especially colorectal carcinoma as well as other diseases, disorders and conditions. 

These materials are further useful in the generation of antibodies that bind
immunospecifically to the novel substances of the invention for use in therapeutic or
diagnostic methods. These antibodies may be generated according to methods known in the
art, using prediction from hydrophobicity charts, as described in the "Anti-NOVX
Antibodies" section below. The disclosed NOV18 protein has multiple hydrophilic regions,
each of which can be used as an immunogen. In one embodiment, a contemplated NOV18
epitope is from about amino acids 10 to 25. In another embodiment, a contemplated NOV18
epitope is from about amino acids 45 to 62. In other specific embodiments, contemplated
NOV18 epitopes are from about amino acids 70 to 75, 78 to 82, 90 to 95, 110 to 112, 118 to
125 and 140 to 145.
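
The kind of hydrophobicity chart referred to above can be sketched as a sliding-window average over a hydropathy scale. The disclosure does not state which scale or window size was used; the Kyte-Doolittle scale and a seven-residue window are assumptions made here for illustration, and the function name is hypothetical. The input is the NOV18 fragment (residues 3 to 42) shown in Table 18H.

```python
# Kyte-Doolittle hydropathy values (assumed scale; more negative = more hydrophilic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def window_hydropathy(sequence: str, window: int = 7) -> list[float]:
    """Mean hydropathy of every full window along the sequence."""
    if window < 1 or window > len(sequence):
        raise ValueError("window must be between 1 and the sequence length")
    scores = []
    for start in range(len(sequence) - window + 1):
        segment = sequence[start:start + window]
        scores.append(sum(KYTE_DOOLITTLE[aa] for aa in segment) / window)
    return scores


# NOV18 residues 3-42, as aligned in Table 18H
fragment = "KSKNHTTHNQSRKWHRNGIKKPRSQRYESLKGVDPKFLRN"
profile = window_hydropathy(fragment)
most_hydrophilic = min(range(len(profile)), key=profile.__getitem__)
print(most_hydrophilic, round(profile[most_hydrophilic], 2))
```

Windows with strongly negative averages are candidate hydrophilic regions. The epitope ranges listed above come from the disclosure and are not derived from this sketch.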



NOV19 

A disclosed NOV19 (designated CuraGen Acc. No. CG57211-01), which encodes a
novel Metalloproteinase-Disintegrin (ADAM30)-like protein and includes the 1143
nucleotide sequence (SEQ ID NO:59) is shown in Table 19A. An open reading frame for the
mature protein was identified beginning with an ATG initiation codon at nucleotides 1-3 and
ending with a TAA stop codon at nucleotides 1141-1143. The start and stop codons are in
bold letters in Table 19A.


Table 19A. NOV19 Nucleotide Sequence (SEQ ID NO:59)

ATGAGGTCAGTGCAGATCTTCCTCTCCCAATGCCGTTTGCTCCTTCTACTAGTTCCGACAATGCTCC
TTAAGTCTCTTGGCGAAGATGTAATTTTTCACCCTGAAGGGGAGTTTGACTCGTATGAAGTCACCAT 
TCCTGAGAAGCTGAGCTTCCGGGGAGAGGTGCAGGGTGTGGTCAGTCCCGTGTCCTACCTACTGCAG 
TTAAAAGGCAAGAAGCACGTCCTCCATTTGTGGCCCAAGAGACTTCTGTTGCCCCGACATCTGCGCG 
TTTTCTCCTTCACAGAACATGGGGAACTGCTGGAGGATCATCCTTACATACCAAAGGACTGCAACTA 
CATGGGCTCCGTGAAAGAGTCTCTGGACTCTAAAGCTACTATAAGCACATGCATGGGGGGTCTCCGA 
GGTGTATTTAACATTGATGCCAAACATTACCAAATTGAGCCCCTCAAGGCCTCTCCCAGTTTTGAAC 
ATGTCGTCTATCTCCTGAAGAAAGAGCAGTTTGGGAATCAGGCAGAAAATCTCATGTGCTGGGGCAC 
AGGCTATCATCTATCCATGAAACCCATGGGAATACCTGACCTAGGTATGATAAATGATGGCACCTCC 
TGTGGAGAAGGCCGGGTATGTTTTAAAAAAAATTGCGTCAATAGCTCAGTCCTGCAGTTTGACTGTT 
TGCCTGAGAAATGCTy^TACCCGGGGTGTTTGCAACAACAGAAAAAGCTGCCACTGCATGTATGGGT^ 
GGCACCTCCATTCTGTGAGGAAGTGGGGTATGGAGGAAGCATTGACAGTGGGCCTCCAGGACTGCTC
AGAGGGGCGATTCCCTCGTCAATTTGGGTTGTGTCCATCATAATGTTTCGCCTTATTTTATTAATCC 
TTTCAGTGGTTTTTGTGTTTTTCCGGCAAGTGATAGGAAACCACTTAAAACCCAAACAGGAAAAAAT 
GCCACTATCCAAAGCAAAAACTGAACAGGAAGAATCTAAAACAAAAACTGTACAGGAAGAATCTAAA
ACAAAAACTGGACAGGAAGAATCTGAAGCAAAAACTGGACAGGAAGAATCTAAAGCAAAAACTGGAC 
AGGAAGAATCTAAAGCAAACATTGAAAGTAAACGACCCAAAGCAAAGAGTGTCAAGAAACAAAAAAA 
GTAA 

The disclosed NOV19 nucleic acid sequence maps to chromosome 1 and has 635 of
636 bases (99%) identical to a gb:GENBANK-ID:AF171932|acc:AF171932.1 mRNA from
Homo sapiens (Homo sapiens metallaproteinase-disintegrin (ADAM30) mRNA, complete
cds) (E = L5e*^^^).

A disclosed NOV19 polypeptide (SEQ ID NO:60) is 380 amino acid residues in
length and is presented using the one-letter amino acid code in Table 19B. The SignalP,
Psort and/or Hydropathy results predict that NOV19 has a signal peptide and is likely to be
localized to the plasma membrane with a certainty of 0.4600. In alternative embodiments, a
NOV19a polypeptide is located to the endoplasmic reticulum (membrane) with a certainty of
0.1000, the endoplasmic reticulum (lumen) with a certainty of 0.1000, or the outside of the
cell with a certainty of 0.1000. The SignalP predicts a likely cleavage site for a NOV19
peptide between amino acid positions 27 and 28, i.e., at the sequence SLG-ED.

Table 19B. Encoded NOV19 Protein Sequence (SEQ ID NO:60)

HLWPKRLLLPKHLRVFSFTEHGELLEDHPYIPKDCNYMGSVKESIJ^SKATISTCM^ 

ASPSFEHVVYLLKKBQF<aiQAENIMCWGTGYHLSMKPMGIPDI/»5INTC 

EKCOTRGVOMISnaCSCHCmGWAPPFCEEVGYGGSIDSGPPGLI^ 

VIGNHLKPKQEKMPI.SKAKTEQEESKTKTVQEESKTKTGQEESEAKTGQEESKAKTGQBESK^ 
KKQKK 

The NOV19 amino acid sequence was found to have 210 of 211 amino acid residues 
(99%) identical to, and 211 of 211 amino acid residues (100%) similar to, the 790 amino acid 
residue ptnr:SPTREMBL-ACC:Q9UKF2 protein from Homo sapiens (Human) 
(METALLAPROTEINASE-DISINTEGRIN) (E - 2.3e ^^^). 

NOV19 is expressed in at least the following tissues: Adrenal Gland/Suprarenal 
gland, Prostate, Testis, and Whole Organism. Expression information was derived from the 
tissue sources of the sequences that were included in the derivation of the sequence of 
CuraGen Acc. No. CG57211-01. The sequence is predicted to be expressed in testis because 
of the expression pattern of a closely related homolog in Homo sapiens 
(GENBANK-ID: gb:GENBANK-ID:AF171932|acc:AF171932.1), Homo sapiens 
metallaproteinase-disintegrin (ADAM30) mRNA, complete cds. 

Homologies to any of the above NOV19 proteins will be shared by the other NOV19 
proteins insofar as they are homologous to each other as shown above. Any reference to 
NOV19 is assumed to refer to both of the NOV19 proteins in general, unless otherwise noted. 

Possible small nucleotide polymorphisms (SNPs) found for NOV19 are listed in 
Table 19C. 



Table 19C: SNPs 

Variant     Nucleotide Position  Base Change  Amino Acid Position  Amino Acid Change
13376670    166                  C>T          56                   Gln>End
13376669    167                  A>G          56                   Gln>Arg
13376668    353                  A>G          118                  Glu>Gly
13376667    440                  A>G          147                  Glu>Gly
13376662    701                  G>A          234                  Cys>Tyr
13376661    736                  T>C          246                  Trp>Arg
13376660    979                  A>G          327                  Thr>Ala
13376659    989                  T>A          330                  Val>Glu

NOV19 also has homology to the amino acid sequences shown in the BLASTP data 
listed in Table 19D. 



Table 19D. BLAST results for NOV19 

Gene Index/Identifier          Protein/Organism                 Length (aa)  Identity (%)    Positives (%)   Expect
gi|11497609|ref|NP_068566.1|   a disintegrin and                790          200/201 (99%)   201/201 (99%)   e-118
(NM_021794)                    metalloproteinase domain 30,
                               isoform 1 preproprotein
                               [Homo sapiens]
gi|9966785|ref|NP_065067.1|    a disintegrin and                781          191/201 (95%)   191/201 (95%)   e-111
(NM_020334)                    metalloproteinase domain 30,
                               isoform 2 preproprotein
                               [Homo sapiens]
gi|9966766|ref|NP_065063.1|    a disintegrin and                729          68/142 (47%)    87/142 (60%)    2e-31
(NM_020330)                    metalloprotease domain 21;
                               a disintegrin and
                               metalloprotease domain
                               (ADAM) 21 [Mus musculus]
gi|14749466|ref|XP_016158.2|   a disintegrin and                722          64/137 (46%)    82/137 (59%)    2e-31
(XM_016158)                    metalloproteinase domain 21
                               preproprotein [Homo sapiens]
gi|11497040|ref|NP_003804.1|   a disintegrin and                722          64/137 (46%)    82/137 (59%)    2e-31
(NM_003813)                    metalloproteinase domain 21
                               preproprotein [Homo sapiens]




The homology of these sequences is shown graphically in the ClustalW analysis 
shown in Table 19E. 



Table 19E. ClustalW Analysis of NOV19 

1) NOV19 (SEQ ID NO:60) 
2) gi|11497609| (SEQ ID NO:285) 
3) gi|9966785| (SEQ ID NO:286) 
4) gi|9966766| (SEQ ID NO:287) 
5) gi|14749466| (SEQ ID NO:288) 
6) gi|11497040| (SEQ ID NO:289) 

[Multiple sequence alignment figure; the alignment rows are not recoverable from the source text.] 

Table 19F lists the domain description from DOMAIN analysis results against 
NOV19. This indicates that the NOV19 sequence has properties similar to those of other 
proteins known to contain these domains. 

Table 19F. Domain Analysis of NOV19 

gnl|Pfam|pfam01562, Pep_M12B_propep, Reprolysin family propeptide. This region 
is the propeptide for members of peptidase family M12B. The propeptide contains 
a sequence motif similar to the "cysteine switch" of the matrixins. This motif 
is found at the C terminus of the alignment but is not well aligned. 

CD-Length = 117 residues, only 71.8% aligned 

Score = 90.1 bits (222), Expect = 2e-19 

NOV19: 76 HLWPKKLIiPRHIiRWSFTEHGELLEDHPYIPKDCNYMGSVKESIJ^SKATISTCMGG^ 135 

II III I + I 1+ +111 I i 11+ +1 ++1II INI 

Sbjct: 1 HLEKNRSLLAPDFTVTTYDDDGTLWEHPLIQDHCyyQGYVEGYPNSAVSLSTC-S 59 

NOV19: 136 VFNIDAKHYQIEPLKASPSFEHWY 160 (SEQ ID NO:290) 

+ ++ I illk+l llk+l 

Sbjct: 60 ILQLENLSYGIEPLESSDGFEHIIY 84 (SEQ ID NO:291) 



gnl|Smart|smart00608, ACR, ADAM Cysteine-Rich Domain 
CD-Length = 139 residues, 29.5% aligned 
Score = 55.5 bits (132), Expect = 6e-09 

NOV19: 173 NLMCWGTGYHLSMKPMGIPDLGMINDGTSCGEGRVCFKKNCVNS 216 (SEQ ID NO:292) 

[+11 III 111111+ Hi II i+ii 11+ 

Sbjct: 99 GLVCWSLDYHLGSD IPDLGMVKDGTKCGPGKVCINGQCVDV 139 (SEQ ID NO:293) 



A sequence of about thirty to forty amino-acid residues found in the sequence of 
epidermal growth factor (EGF) has been shown to be present, in a more or less conserved 
form, in a large number of other, mostly animal, proteins. The list of proteins currently known 
to contain one or more copies of an EGF-like pattern is large and varied. The functional 
significance of EGF domains in what appear to be unrelated proteins is not yet clear. 
However, a common feature is that these repeats are found in the extracellular domain of 
membrane-bound proteins or in proteins known to be secreted (exception: prostaglandin G/H 
synthase). The EGF domain includes six cysteine residues which have been shown (in EGF) 
to be involved in disulfide bonds. The main structure is a two-stranded beta-sheet followed 
by a loop to a C-terminal short two-stranded sheet. Subdomains between the conserved 
cysteines vary in length. See InterPro IPR000561: EGF. 

This indicates that the sequence of the invention has properties similar to those of 
other proteins known to contain this/these domain(s) and similar to the properties of these 
domains. 

ADAMs are a family of cell surface proteins with a domain structure composed of a 
signal sequence, a prodomain with a cysteine switch, a metalloproteinase-like domain, a 
disintegrin-like domain, a cysteine-rich domain, a transmembrane domain, and a C-terminal 
cytoplasmic domain. Members of this family have been implicated in a variety of biologic 
processes involving cell-cell and cell-matrix interactions, including fertilization, muscle 
development, and neurogenesis. 

By searching a DNA sequence database, Cerretti et al. (1999) identified 2 ESTs 
representing the novel ADAMs ADAM29 (604778) and ADAM30. The ADAM30 EST 
encodes a polypeptide with sequence similarity to the cysteine-rich region of ADAM21 
(603713). Cerretti et al. (1999) screened a human testis cDNA library with the ADAM30 
EST and isolated cDNAs encoding 2 forms of ADAM30 that differ in the cytoplasmic 
domain. The first predicted ADAM30 protein has 790 amino acids and contains all of the 
domains characteristic of ADAMs. The metalloproteinase domain of ADAM30 has a 
consensus zinc-binding motif, suggesting that ADAM30 is proteolytically active. The second 
form of ADAM30, which the authors called ADAM30-beta, has a deletion of 9 amino acids 
in its cytoplasmic domain compared to the first form, resulting in a protein with 781 amino 
acids. Northern blot analysis of a variety of human tissues detected an approximately 3.0-kb 
ADAM30 transcript only in testis. 

The protein similarity information, expression pattern, cellular localization, and map 
location for the NOV19 protein and nucleic acid disclosed herein suggest that this 
Metallaproteinase-disintegrin (ADAM30)-like protein may have important structural and/or 
physiological functions characteristic of the ADAM family. Therefore, the nucleic acids and 
proteins of the invention are useful in potential diagnostic and therapeutic applications and as 
a research tool. These include serving as a specific or selective nucleic acid or protein 
diagnostic and/or prognostic marker, wherein the presence or amount of the nucleic acid or 
the protein are to be assessed. These also include potential therapeutic applications such as 
the following: (i) a protein therapeutic, (ii) a small molecule drug target, (iii) an antibody 
target (therapeutic, diagnostic, drug targeting/cytotoxic antibody), (iv) a nucleic acid useful in 
gene therapy (gene delivery/gene ablation), (v) an agent promoting tissue regeneration in 
vitro and in vivo, and (vi) a biological defense weapon. 

The NOV19 nucleic acids and proteins of the invention have applications in the 
diagnosis and/or treatment of various diseases and disorders. For example, the compositions 
of the present invention will have efficacy for the treatment of patients suffering from: 
fertility problems, adrenoleukodystrophy, congenital adrenal hyperplasia as well as other 
diseases, disorders and conditions. 

These materials are further useful in the generation of antibodies that bind 
immunospecifically to the novel substances of the invention for use in therapeutic or 
diagnostic methods. These antibodies may be generated according to methods known in the 
art, using prediction from hydrophobicity charts, as described in the "Anti-NOVX 
Antibodies" section below. The disclosed NOV19 protein has multiple hydrophilic regions, 
each of which can be used as an immunogen. In one embodiment, a contemplated NOV19 
epitope is from about amino acids 40 to 50. In another embodiment, a contemplated NOV19 
epitope is from about amino acids 60 to 65. In other specific embodiments, contemplated 
NOV19 epitopes are from about amino acids 90 to 120, 140 to 152, 160 to 190, 195 to 205, 
220 to 245, 249 to 252 and 310 to 370. 

NOV20 

A disclosed NOV20 (designated CuraGen Acc. No. CG57222-01), which encodes a 
novel Bone Morphogenetic Protein-like protein and includes the 1207 nucleotide sequence 
(SEQ ID NO:61) is shown in Table 20A. An open reading frame for the mature protein was 
identified beginning with an ATG initiation codon at nucleotides 54-56 and ending with a 
TAA stop codon at nucleotides 1089-1091. Putative untranslated regions are underlined in 
Table 20A, and the start and stop codons are in bold letters. 
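
The open reading frame coordinates given above can be checked against the stated polypeptide length. The following is a minimal sketch, assuming 1-based inclusive nucleotide coordinates and that the stop codon is counted in the frame but encodes no residue; the function name is hypothetical and is not part of the disclosure.

```python
def orf_protein_length(start_first: int, stop_last: int) -> int:
    """Number of residues encoded by an ORF.

    start_first: 1-based position of the first base of the initiation codon.
    stop_last:   1-based position of the last base of the stop codon.
    """
    orf_nt = stop_last - start_first + 1          # inclusive span in nucleotides
    if orf_nt % 3 != 0:
        raise ValueError("ORF span is not a whole number of codons")
    return orf_nt // 3 - 1                        # the stop codon adds no residue


# NOV20: ATG at nucleotides 54-56, TAA at nucleotides 1089-1091.
print(orf_protein_length(54, 1091))  # 345
```

Under these assumptions the frame spans 1038 nucleotides, or 346 codons including the stop, which agrees with the 345-residue NOV20 polypeptide described for SEQ ID NO:62.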



Table 20A. NOV20 Nucleotide Sequence (SEQ ID NO:61) 

CCGCGGGACTCCGGCGTCCCCGCCCCCCAGTCCTCCCTCCCCTCCCCTCCAGCA TGGTGCTCGCGGCC 

CCGCTGCTGCTGGGCTTCCTGCTCCTCGCCCTGGAGCTGCGGCCCCGGGGGGAGGCGGCCGAGGGCCC 

CGCGGCGGCGGCGGCGGCGGCGGCGGCGGCGGCAGCGGCGGGGGTCGGGGGGGAGCGCTCCAGCCGGC 

CAGCCCCGTCCGTGGCGCCCGAGCCGGACGGCTGCCCCGTGTGCGTATGGCGGCAGCACAGCCGCGAG 

CTGCGCCTAGAGAGCATCAAGTCGCAGATCTTGAGCAAACTGCGGCTCAAGGAGGCGCCCAACATCAG 

CCGCGAGGTGGTGAAGCAGCTGCTGCCCAAGGCGCCGCCGCTGCAGCAGATCCTGGACCTACACGACT 

TCCAGGGCGACGCGCTGCAGCCCGAGGACTTCCTGGAGGAGGACGAGTACCACGCCACCACCGAGACC 

GTCATTAGCATGGCCCAGGAGACGGACCCaGCAGTACAGACAGATGGCAGCCCTCrCTGCTGCC^ 

TCACTTCAGCCCCAAGGTGATGTTCTVCAAAGAGCaTCGACTTC^ 

GCCAGCCACAGAGCAACrGGGGCATCGAGATCAACGCCTTTGATCCCAGTGGCACAGACCTGGCTGTC 
ACCTCCCTGGGGCCGGGAGCCGAGGGGCTGCATCCATTCATGGAGCTTCGAGTCCTAGAGAACACAAA 
ACGTTCCCGGCGGAACCTGGGTCTGGACTGCGACGAGCACTCAAGCGAGTCCCGCTGCTGCCGATATC 
CCCTCACAGTGGACTTTGAGGCTTTCGGCTGGGACTGGATCATCGCACCTAAGCGCTACAAGGCCAAC 
TACTGCTCCGGCCAGTGCGAGTACATGTTCATGCAAAAATATCCGCATACCCATTTGGTGCAGCAGGC 
CAATCCAAGAGGCTCTGCTGGGCCCTGTTGTACCCCCACCAAGATGTCCCCAATCAACATGCTCTACT 
TCAATGACAAGCAGCAGATTATCTACGGCAAGATACCTGGCATGGTGGTGGATCGCTGTGGCTGCTCT 
TA AGTGGGTCACTACAAGCTGCTGGAGCAAAGACTTGGTGGGTGGGTAACTTAACCTCTTCACAGAGG 
ATAAAAAATGCTTGTGAGTATGACAGAAGGGAATAAACAGGCTTAAAGGGT 

The disclosed NOV20 nucleic acid sequence maps to chromosome 12 and has 597 of 
629 bases (94%) identical to a gb:GENBANK-ID:AF100907|acc:AF100907.1 mRNA from 
Homo sapiens (Homo sapiens bone morphogenetic protein 11 (BMP11) mRNA, complete 
cds)(E = 2.3e-^^'). 

A disclosed NOV20 polypeptide (SEQ ID NO:62) is 345 amino acid residues in 
length and is presented using the one-letter amino acid code in Table 20B. The SignalP, 
Psort and/or Hydropathy results predict that NOV20 has a signal peptide and is likely to be 
localized to the outside of the cell with a certainty of 0.8200. In alternative embodiments, a 
NOV20a polypeptide is located to the endoplasmic reticulum (membrane) with a certainty of 
0.1000, the endoplasmic reticulum (lumen) with a certainty of 0.1000, or the microbody 
(peroxisome) with a certainty of 0.1000. The SignalP predicts a likely cleavage site for a 
NOV20 peptide between amino acid positions 24 and 25, i.e. at the sequence GEA-AE. 
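
The predicted cleavage site can be illustrated by translating the start of the open reading frame and splitting the product after residue 24. This is a sketch only: the 78-nucleotide string below is the beginning of the reading frame as read from Table 20A starting at nucleotide 54, the standard genetic code is assumed, and the helper names are hypothetical.

```python
# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(nt: str) -> str:
    """Translate a nucleotide string in frame 1, ignoring a trailing partial codon."""
    usable = len(nt) - len(nt) % 3
    return "".join(CODON_TABLE[nt[i:i + 3]] for i in range(0, usable, 3))


def split_at_cleavage(protein: str, last_signal_residue: int):
    """Split between 1-based positions last_signal_residue and last_signal_residue + 1."""
    return protein[:last_signal_residue], protein[last_signal_residue:]


# First 26 codons of the NOV20 reading frame (nucleotides 54-131 of Table 20A).
orf_start = (
    "ATGGTGCTCGCGGCC"
    "CCGCTGCTGCTGGGCTTCCTGCTCCTCGCCCTGGAGCTGCGGCCCCGGGGGGAGGCGGCCGAG"
)
n_terminus = translate(orf_start)
signal, mature_start = split_at_cleavage(n_terminus, 24)
print(n_terminus)                              # MVLAAPLLLGFLLLALELRPRGEAAE
print(signal[-3:] + "-" + mature_start[:2])    # GEA-AE
```

The residues flanking the split reproduce the GEA-AE motif stated for the predicted cleavage between positions 24 and 25.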



Table 20B. Encoded NOV20 Protein Sequence (SEQ ID NO:62) 

MVI^PLLLGFLLIJyLELRPRGEAAEGPAAAAAAAAAAAAAGVGGERSSRPAPSVAPEPDGCPVCVWRQHSRE 
ESI KSQ ILS KLRLKEAPNISREVVKQLLPKAPPLQQILDLHDFQGDALQPEDFLEEDEYHATTETVI SMAQETDPA 
VQTDGSPLCCHFHFSPKVMFTKSIDFKQVLHSWFRQPQSNWGIEINAFDPSGTDIAVTSLGPGAEGLHPFMELRV^ 
ENTKRSRRNLGLDCDEHSSESRCCRYPLTVDFEAFGWDWIIAPKRYKANYCSGQCEYMFMQKYPHTm 
SAGPCCTPTKMSPINMLYFNDKQQIIYGKIPGMWDRCGCS 

The NOV20 amino acid sequence was found to have 171 of 172 amino acid residues 
(99%) identical to, and 172 of 172 amino acid residues (100%) similar to, the 407 amino acid 
residue ptnr:SWISSNEW-ACC:O95390 protein from Homo sapiens (Human) 
(GROWTH/DIFFERENTIATION FACTOR-11 PRECURSOR (BONE MORPHOGENETIC 
PROTEIN 11)) (E == 2.5e'^**). 
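
The identity and similarity percentages quoted for NOV20 can be reproduced from the match counts. The sketch below assumes the percentages are truncated to a whole number rather than rounded, which is an inference from the reported figures (597 of 629 is about 94.9% and is reported as 94%); the function name is hypothetical.

```python
def percent_identity(matches: int, aligned: int) -> int:
    """Whole-number percentage of matching positions, truncated (not rounded)."""
    if aligned <= 0:
        raise ValueError("aligned length must be positive")
    return (100 * matches) // aligned


print(percent_identity(597, 629))  # nucleic acid identity -> 94
print(percent_identity(171, 172))  # amino acid identity   -> 99
print(percent_identity(172, 172))  # amino acid similarity -> 100
```

Rounding to the nearest integer would give 95% for the nucleic acid comparison, so truncation is the reading consistent with the figures in the text.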

NOV20 is expressed in at least the following tissues: muscle, neural and uterine cells. 
Expression information was derived from the tissue sources of the sequences that were 
included in the derivation of the sequence of NOV20. 

Possible small nucleotide polymorphisms (SNPs) found for NOV20 are listed in 
Table 20C. 



Table 20C: SNPs 

Variant     Nucleotide Position  Base Change  Amino Acid Position  Amino Acid Change
13377014    460                  A>G          136                  His>Arg
13374718    591                  C>T          180                  Gln>End
13377008    702                  G>A          217                  Glu>Lys
13377013    725                  G>A          NA                   NA
13377012    747                  A>G          232                  Lys>Glu
13377011    870                  C>T          273                  Arg>Cys
13377009    1013                 G>A          320                  Met>Ile
13377010    896                  C>T          NA                   NA
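
The relationship between the nucleotide positions and amino acid positions in Table 20C follows from the position of the initiation codon. The sketch below assumes the reading frame begins at nucleotide 54, as stated for NOV20, and uses 1-based numbering throughout; the function name is hypothetical.

```python
ORF_START = 54  # first base of the ATG initiation codon of NOV20


def codon_of(nt_pos: int, orf_start: int = ORF_START):
    """Return (amino acid position, position within codon), both 1-based."""
    offset = nt_pos - orf_start
    if offset < 0:
        raise ValueError("position lies upstream of the reading frame")
    return offset // 3 + 1, offset % 3 + 1


for pos in (460, 591, 702, 747, 870, 1013):
    print(pos, codon_of(pos))
```

For the six coding-change variants this gives amino acid positions 136, 180, 217, 232, 273 and 320, matching the table. Variants 725 and 896 fall on the third base of codons 224 and 281 under the same assumption, which would be consistent with their being listed as NA.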

Homologies to any of the above NOV20 proteins will be shared by the other NOV20 
proteins insofar as they are homologous to each other as shown above. Any reference to 
NOV20 is assumed to refer to all of the NOV20 proteins in general, unless otherwise noted. 

NOV20 also has homology to the amino acid sequences shown in the BLASTP data 
listed in Table 20D. 



Table 20D. BLAST results for NOV20 

Gene Index/Identifier          Protein/Organism                 Length (aa)  Identity (%)    Positives (%)   Expect
gi|6649914|gb|AAF21630.1|      growth/differentiation           379          306/379 (80%)   309/379 (80%)   e-162
AF028333_1 (AF028333)          factor-11 [Homo sapiens]
gi|5031613|ref|NP_005802.1|    growth differentiation           407          334/407 (82%)   337/407 (82%)   e-158
(NM_005811)                    factor 11; bone morphogenetic
                               protein 11 [Homo sapiens]
gi|13124273|sp|Q9Z1W4|         GROWTH/DIFFERENTIATION           405          323/407 (79%)   326/407 (79%)   e-155
GDFB_MOUSE                     FACTOR 11 PRECURSOR (BONE
                               MORPHOGENETIC PROTEIN 11)
gi|6649923|gb|AAF21633.1|      growth/differentiation           405          322/407 (79%)   325/407 (79%)   e-155
(AF028337)                     factor-11; GDF-11
                               [Mus musculus]
gi|13124255|sp|Q9Z217|         Growth/differentiation           345          267/345 (77%)   271/345 (78%)   e-146
GDFB_RAT                       factor 11 precursor (Bone
                               morphogenetic protein 11)




The homology of these sequences is shown graphically in the ClustalW analysis 
shown in Table 20E. 



Table 20E. ClustalW Analysis of NOV20 

1) NOV20 (SEQ ID NO:62) 
2) gi|6649914| (SEQ ID NO:294) 
3) gi|5031613| (SEQ ID NO:295) 
4) gi|13124273| (SEQ ID NO:296) 
5) gi|6649923| (SEQ ID NO:297) 
6) gi|13124255| (SEQ ID NO:298) 

[Multiple sequence alignment figure; the alignment rows are not recoverable from the source text.] 



Table 20F lists the domain description from DOMAIN analysis results against 
NOV20. This indicates that the NOV20 sequence has properties similar to those of other 
proteins known to contain these domains. 



Table 20F. Domain Analysis of NOV20 

gnl|Smart|smart00204, TGFB, Transforming growth factor-beta (TGF-beta) family; 
Family members are active as disulphide-linked homo- or heterodimers. TGFB is a 
multifunctional peptide that controls proliferation, differentiation, and other 
functions in many cell types. 

CD-Length = 102 residues, 100.0% aligned 
Score = 131 bits (329), Expect = 7e-32 

NOV20: 251 CCRYPLTVDFBAFGWD-WIIAPKRYKANYCSGQCEYMFMQKyPHTH LVQQANPR 303 

I h I 111+ III mill I I II + 1+ II ^1 

Sbjct: 1 CRRHDLYVDFKDLGWDDWIIAPKGYNAyyCSGECPFPLSERIJmTNHAIVQS 60 
NOV20: 304 GSAGPCCTPTKMSPINMLYFNDKQQIIYGKIPt^tVVDRCXSCS 345 (SEQ ID NO:299X) 

III ||| + ||++|||^+| Mlh III 

Sbjct: 61 AVPKPCCVPTKLSPI.SMLYYDDDGNWLRNyPNMWEECGCR 102 (SEQ ID NO: 3 00) 



gnl|Pfam|pfam00019, TGF-beta, Transforming growth factor beta like domain. 

CD-Length = 105 residues, 97.1% aligned 
Score = 103 bits (256), Expect = 2e-23 
NOV20: 251 CCRYPLTVDFBAFGW-DWIIAPKRYKANYCSGQCEYMFMQKYPHTH LVQQAMPR 303 

I I III II milk I iitiii I ^ ih III 

Sbjct: 4 CRLRSLYVDFRDLGWGDWIIAPEGYIJUSnfCSGSCPFPLRDDIJ^LSNHAILQ'IXVRI.RNPR 63 
NOV20: 3 04 GSAGPCCTPTKMSPINMLYFNDKQQIIYGKIPGMWDRCGCS 345 (SEQ ID NO: 299) 

111 lllHk+lll +1 ^+ Ml 111 

Sbjct: 64 AVPQPCCVPTKLSPIiSMLYLDDNSNWLRLYPNMSVKECGCR 105 (SEQ ID NO: 3 00) 



gnl|Pfam|pfam00688, TGFb_propeptide, TGF-beta propeptide. This propeptide is 
known as latency associated peptide (LAP) in TGF-beta. LAP is a homodimer which 
is disulfide linked to TGF-beta binding protein. 

CD-Length = 227 residues, 46.3% aligned 

Score = 48.1 bits (113), Expect = 8e-07 
(SEQ ID NO:302) 

NOV20: 62 CPVCVWRQHSRELRLESIKSQILSKLRLKEAPNISREWKQLLPKAPPLQQILDLHDFQG 121 

1 k ++ llhk llllll 1+ 1 l+l * 

Sbjct: 1 CRPLDLRRSQKQDRLEAIEGQILSKLGLRRRPRPSKE PMWPEYMLDLYNALS 53 

NOV20: 122 DALQ- -PEDFLEEDEYHATTETVISMAQ ETDPAVQTDGSPLCCHFHF 166 

+ + I +1 + + II 

Sbjct: 54 BLEEGKVGRVPEISDYDGREAGRANTIRSFSHLESDDFEESTPESHRKRFRF 105 

(SEQ ID NO: 3 03) 



The homology and domain information indicate that the sequence of the invention has 
properties similar to those of other proteins known to contain this/these domain(s) and similar 
to the properties of these domains. 

Transforming growth factor-beta (TGF-beta) is a multifunctional peptide that controls 
proliferation, differentiation and other functions in many cell types. TGF-beta-1 is a peptide 
of 112 amino acid residues derived by proteolytic cleavage from the C-terminal of a 
precursor protein. See IPR001839. 

A number of proteins are known to be related to TGF-beta-1. Proteins from the TGF- 
beta family are active only as homo- or heterodimers, the two chains being linked by a single 
disulfide bond. From X-ray studies of TGF-beta-2, it is known that all the other cysteines are 
involved in intrachain disulfide bonds. As shown in the following schematic representation, 
there are four disulfide bonds in the TGF-betas and in inhibin beta chains, while the other 
members of this family lack the first bond. 

Interchain 

xxxxcxxxxxCcxxxxxxxxxxxxxxxxxxCxxCxxxxxxxxxxxxxxxxxxxCCxxxxxxxxacxxxxxxxxxxCxCx 

'C': conserved cysteine involved in a disulfide bond. 
[Schematic; the lines marking disulfide connectivity are not recoverable from the source text.] 



The transforming growth factor beta, N-terminus (TGFb) domain is present in a 
variety of proteins which include the transforming growth factor beta, decapentaplegic 
proteins and bone morphogenetic proteins. Transforming growth factor beta is a 
multifunctional peptide that controls proliferation, differentiation and other functions in many 
cell types. The decapentaplegic protein acts as an extracellular morphogen responsible for 
the proper development of the embryonic dorsal hypoderm, for viability of larvae and for cell 
viability of the epithelial cells in the imaginal disks. Bone morphogenetic protein induces 
cartilage and bone formation and may be responsible for epithelial osteogenesis in some 
organisms. See IPR001111. 

The bones that comprise the axial skeleton have distinct morphologic features 
characteristic of their positions along the anterior/posterior axis. McPherron et al. (1997) 
described a novel mouse TGF-beta family member, myostatin, encoded by the gene Mstn 
(601788), that has an essential role in regulating skeletal muscle mass. By low-stringency 
screening, McPherron et al. (1997) also identified a gene related to Mstn. The cloning of this 
gene, designated Gdf11 (also called Bmp11), was also reported by Gamer et al. (1999) and 
Nakashima et al. (1999). McPherron et al. (1999) showed that Gdf11, a transforming growth 
factor-beta (TGF-beta) superfamily member, has an important role in establishing the 
patterning of the axial skeleton. They found that during early mouse embryogenesis Gdf11 is 
expressed in the primitive streak and tail bud regions, which are sites where new mesodermal 
cells are generated. Homozygous mutant mice carrying a targeted deletion of Gdf11 exhibited 
anteriorly directed homeotic transformations throughout the axial skeleton and posterior 
displacement of the hindlimbs. The effect of the mutation was dose dependent, as Gdf11 +/- 
mice had a milder phenotype than Gdf11 -/- mice. Mutant embryos showed alterations in 
patterns of Hox (see 142950) gene expression, suggesting that Gdf11 acts upstream of the 
Hox genes. McPherron et al. (1999) interpreted their findings to indicate that Gdf11 is a 
secreted signal that acts globally to specify positional identity along the anterior/posterior 
axis. To their knowledge, Gdf11 was the first secreted protein to be discovered that functions 
globally to regulate anterior/posterior axial patterning. The homeotic transformations 
observed in Gdf11 mutant mice were more extensive than those seen either by genetic 
manipulation of presumed patterning genes or by administration of retinoic acid. The 
question was raised of whether Gdf11 and retinoic acid interact to regulate Hox gene 
expression and anterior/posterior patterning and whether Gdf11 regulates the patterning of 
tissues other than those studied by McPherron et al. (1999). 

The protein similarity information, expression pattern, cellular localization, and map 
location for the NOV20 protein and nucleic acid disclosed herein suggest that this Bone 
Morphogenetic Protein 11-like protein may have important structural and/or physiological 
functions characteristic of the TGF-beta family. Therefore, the nucleic acids and proteins of 
the invention are useful in potential diagnostic and therapeutic applications and as a research 
tool. These include serving as a specific or selective nucleic acid or protein diagnostic and/or 
prognostic marker, wherein the presence or amount of the nucleic acid or the protein are to be 
assessed. These also include potential therapeutic applications such as the following: (i) a 
protein therapeutic, (ii) a small molecule drug target, (iii) an antibody target (therapeutic, 
diagnostic, drug targeting/cytotoxic antibody), (iv) a nucleic acid useful in gene therapy (gene 
delivery/gene ablation), (v) an agent promoting tissue regeneration in vitro and in vivo, and 
(vi) a biological defense weapon. 

The NOV20 nucleic acids and proteins of the invention have applications in the 
diagnosis and/or treatment of various diseases and disorders. For example, the compositions 
of the present invention will have efficacy for the treatment of patients suffering from: 
muscle wasting disease, a neuromuscular disorder, muscle atrophy, obesity or other adipocyte 
cell disorders, and aging as well as other diseases, disorders and conditions. 

These materials are further useful in the generation of antibodies that bind 
immunospecifically to the novel substances of the invention for use in therapeutic or 
diagnostic methods. These antibodies may be generated according to methods known in the 
art, using prediction from hydrophobicity charts, as described in the "Anti-NOVX 
Antibodies" section below. The disclosed NOV20 protein has multiple hydrophilic regions, 
each of which can be used as an immunogen. In one embodiment, a contemplated NOV20 
epitope is from about amino acids 55 to 57. In another embodiment, a contemplated NOV20 
epitope is from about amino acids 60 to 62. In other specific embodiments, contemplated 
NOV20 epitopes are from about amino acids 67 to 70, 90 to 99, 110 to 112, 115 to 117, 130 
to 145, 148 to 149, 150 to 152, 158 to 161, 180 to 200, 230 to 250, 260 to 310 and 320 to 
325. 

NOV21 

One NOVX protein of the invention, referred to herein as NOV21, includes three 
Adrenomedullin Receptor-like proteins. The disclosed proteins have been named NOV21a, 
NOV21b and NOV21c. 



NOV21a 

A disclosed NOV21a (designated CuraGen Acc. No. CG56477-01), which encodes a 
novel Adrenomedullin Receptor-like protein and includes the 1341 nucleotide sequence (SEQ 
ID NO:63) is shown in Table 21A. An open reading frame for the mature protein was 
identified beginning with an ATG initiation codon at nucleotides 51-53 and ending with a 
TGA stop codon at nucleotides 1413-1415. 



Table 21A. NOV21a Nucleotide Sequence (SEQ ID NO:63) 

CAGCCTCCTCACAGCrCCCCATAGCCTGGACCTGCCGGCCCTCCCTCCAGGACCGAGGGGCTCCCAAGGGAAAC 

TCAGGCGTGTGCTGGTCCCAATGTCAGTGAAACCCAGCTGGGGGCCTGGCCCCTCGGAGGGGGTCACCGCAGTG 

CCTACCAGT6ACCTTGGAGAGATCCACAACTGGACCGAGCTGCTTGACCTCTTCAACCACACTTTGTCTGAGTG 

CCACGTGGAGCTCAGCCAGAGCACCAAGCGCGTGGTCCTCTTTGCCCTCTACCTGGCCATGTTTGTGGTTGGGC 

TGGTGGAGAACCTCCTGGTGATATGCGTCAACTGGCGCGGCTCAGGCCGGGCAGGGCTGATGAACCTCTACATC 

CTCAACATGGCCATCGCGGACCTGGGCATTGTCCTGTCTCTGCCCGTGTGGATGCTGGAGGTCACGCTGGACTA 

CACCTGGCTCTGGGGCAGCTTCTCCTGCCGCTTCACTCACTACTTCTACTTTGTCAACATGTATAGCAGCATCT 

TCTTCCTGGTGTGCCTCAGTGTCGACCGCTATGTCACCCTCACCAGCGCCTCCCCCTCCTGGCAGCGTTACCAG 

CACCGAGTGCGGCGGGCCATGTGTGCAGGCATCTGGGTCCTCTCGGCCATCATCCCGCTGCCTGAGGTGGTCCA 

CATCCAGCTGGTGGAGGGCCCTGAGCCCATGTGCCTCTTCATGGCACCTTTTGAAACGTACAGCACCTGGGCCC 

TGGCGGTGGCCCTGTCCACCACCATCCTGGGCTTCCTGCTGCCCTTCCCTCTCATCACAGTCTTCAATGTGCTG 

ACAGCCTGCCGGCTGCGGCAGCCAGGACAACCCAAGAGCCGGCGCCACTGCTTGCTGCTGTGCGCCTACGTGGC 

CGTCTTTGTCATGTGCTGGCTGCCCTATCATGTGACCCTGCTGCTGCTCACACTGCATGGGACCCACATCrC^ 

TCCACTGCCACCTGGTCCACCTGCTCTACTTCTTCTATGATGTCATTGACTGCTTCTCCATGCTGCACTGTGTC 

ATCAACCCCATCCTTTACAACTTTCTCAGCCCACACTTCCGGG6CCGGCTCCTGAATGCTGTAGTCCATTACCT 

TCCTAAGGACCAGACCAAGGCGGGCACATGCGCCTCCTCTTCCTCCTCTTCCACCCAGCATTCCOTCA 

CCAAGGGTGATAGCCAGCCTGCTGCAGCAGCCCCCCACCCTGAGCCAAGCCTGAGCTTTCAGGCAC^ 

CTTCCAAATACTTCCCCCATCTCTCCCACTCAGCCTCrTACACCCAGCTGAGGTACTAGAACT 

GAATTCTAG 



The NOV21a polypeptide (SEQ ID NO:64) is 404 amino acid residues in length and is 
presented using the one-letter amino acid code in Table 21B. 



Table 21B. Encoded NOV21a Protein Sequence (SEQ ID NO:64) 

Mi^nSsWGPGPSiGVTAV^ 

VmWlGSGRAGLimLYILNiyiAIADLGIVLSLPVWMLEVTL^ 

VTIiTSASPSWQRYQHRVRIU^C3V6IWVI*SAIIPLPEVVHIQLVEGPEPMCLFMAPF 

IiPFPLITVITm,TACRLRQPGQPKSRimCLIJiCAYVAV^ 

IDCFSMIJICVINPILYNFLSPHFRGRIiLNAVVHYLPKIXSTKAG^ 

LSFQAHHLLPNTSPISPTQPLTPS 



Possible single nucleotide polymorphisms (SNPs) found for NOV21 are listed in 
Table 21C. 

Table 21C: SNPs 

Variant    Nucleotide Position   Base Change   Amino Acid Position   Amino Acid Change 
13377037   363                   T>C           90                    Leu>Pro 
13377038   604                   G>A           170                   Arg>Arg 
13377039   685                   C>T           197                   Gly>Gly 
13377040   1139                  T>C           349                   Cys>Arg 
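
The relationship between the nucleotide positions and the amino acid positions in Table 21C can be sketched with simple codon arithmetic. The function name and the ORF offset below are illustrative assumptions: an offset of 95 is used only because it reproduces all four amino acid positions in the table, and it is not a coordinate taken from the text.

```python
def codon_position(nucleotide_pos: int, orf_start: int) -> int:
    """Return the 1-based codon (amino acid) index that contains a 1-based nucleotide."""
    if nucleotide_pos < orf_start:
        raise ValueError("nucleotide lies upstream of the open reading frame")
    return (nucleotide_pos - orf_start) // 3 + 1


# Nucleotide positions of the four variants listed in Table 21C.
snp_positions = [363, 604, 685, 1139]

# Assumption: first base of the initiation codon, inferred from the table itself.
ASSUMED_ORF_START = 95

residues = [codon_position(p, ASSUMED_ORF_START) for p in snp_positions]
print(residues)  # [90, 170, 197, 349]
```

With a different assumed ORF start the computed residues shift accordingly, so the check is only as reliable as the offset supplied.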



NOV21b 

A disclosed NOV21b (designated CuraGen Acc. No. CG56477-02), which encodes a 
novel Adrenomedullin Receptor-like protein and includes the 945 nucleotide sequence (SEQ 
ID NO:65), is shown in Table 21D. An open reading frame for the mature protein was 
identified beginning with an ATG initiation codon at nucleotides 1-3 and ending with a TGA 
stop codon at nucleotides 943-945. The start and stop codons are in bold letters in Table 
21D. 
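
As a consistency check on ORF coordinates of this kind, the encoded protein length can be derived from the first base of the start codon and the last base of the stop codon. This is a minimal sketch; the function name is hypothetical, and it assumes 1-based inclusive coordinates with the stop codon included in the range.

```python
def encoded_length(start: int, stop_end: int) -> int:
    """Number of amino acids encoded by an ORF given 1-based inclusive coordinates.

    `start` is the first base of the initiation codon and `stop_end` is the
    last base of the stop codon; the stop codon itself encodes no residue.
    """
    span = stop_end - start + 1
    if span <= 0 or span % 3 != 0:
        raise ValueError("ORF span must be a positive multiple of three")
    return span // 3 - 1


# NOV21b: ATG at nucleotides 1-3, TGA at nucleotides 943-945.
print(encoded_length(1, 945))  # 314
```

The result agrees with the 314-residue length reported for the NOV21b polypeptide.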



Table 21D. NOV21b Nucleotide Sequence (SEQ ID NO:65) 

ATGTCAGTGAAACCCAGCTGGGGGCCTGGCCCCTCGGAGGGGGTCACCGCAGTGCCTACCAGTGACCTTGGAGA 

GATCCACAACTGGACCGAGCTGCTTGACCACCTCTTCAACCACACTTTGTCTGAGTGCCACGTGGAGCTCAGCC 

AGAGCACCAAGCGCGTGGTCCTCTTTGCCCTCTACCTGGCCATGTTTGTGGTTGGGCTGGTGGAGAACCTCCTG 

GTGATATGCGTCAACTGGCGCGGCTCAGGCCGGGCAGGGCTGATGAACCTCTACATCCTCAACATGGCCATCGC 

GGACCTGGGCATTGTCCTGTCTCTGCCCGTGTGGATGCCGGAGGTCACGCTGGACTACACCTGGCTCTGGGGCA 

GCTTCTCCTGCCGCTTCACTCACTACTTCTACTTTGTCAACATGTATAGCAGCATCTTCTTCCTGGTGTGCCTC 

AGTGTCGACCGCTATGTCACCCTCACAGGACAACCCAAGAGCCGGCGCCACTGCCTGCTGCTGTGCGCCTACGT 

GGCCGTCTTTGTCATGTGCTGGCTGCCCTATCATGTGACCCTGCTGCTGCTCACACTGCATGGGACCCACATCT 

CCCTCCACTGCCACCTGGTCCACCTGCTCTACTTCTTCTATGATGTCATTGACTGCTTCTCCATGCTGCACTGT 

GTCATCAACCCCATCCTTTACAACTTTCTCAGCCCACACTTCCGGGGCCGGCTCCTGAATGCTGTAGTCCATTA 

CCTTCCTAAGGACCAGACCAAGGCGGGCACATGCGCCTCCTCTTCCTCCTGTTCCACCCAGCATTCCATCATCA 

TCACCAAGGGTGATAGCCAGCCTGCTGCAGCAGCAGCCCCCCACCCTGAGCCAAGCCTGAGCTTTC^ 

Ca^TTTGCTTCCAAATACTTCCCCCT^TCTCTCCCACTCAGCCTCT^ 

The disclosed NOV21b nucleic acid sequence maps to chromosome 12 and has 473 of 
476 bases (99%) identical to a gb:GENBANK-ID:AR012140|acc:AR012140.1 mRNA from 
Unknown (Sequence 1 from patent US 5763218) (E =33e'^^^). 

A disclosed NOV21b polypeptide (SEQ ID NO:66) is 314 amino acid residues in 
length and is presented using the one-letter amino acid code in Table 21E. The SignalP, Psort 
and/or Hydropathy results predict that NOV21b has a signal peptide and is likely to be 
localized to the plasma membrane with a certainty of 0.6000. In alternative embodiments, a 
NOV21b polypeptide is located to the Golgi body with a certainty of 0.4000, the endoplasmic 
reticulum (membrane) with a certainty of 0.3000 or the mitochondrial inner membrane with a 
certainty of 0.0300. The SignalP predicts a likely cleavage site for a NOV21b peptide between 
amino acid positions 17 and 18, i.e. at the sequence VTA-VP. 
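
A stated cleavage position can be checked against the sequence by printing the residues that flank it. The sketch below is illustrative only: the helper name is hypothetical, and the N-terminal fragment is the legible start of the NOV21b sequence in Table 21E.

```python
def cleavage_motif(seq: str, last_signal_residue: int, left: int = 3, right: int = 2) -> str:
    """Return the residues flanking a cleavage site as 'XXX-YY'.

    `last_signal_residue` is the 1-based position of the final residue of the
    signal peptide, so cleavage falls between it and the next residue.
    """
    if not 0 < last_signal_residue < len(seq):
        raise ValueError("cleavage position must fall inside the sequence")
    upstream = seq[max(0, last_signal_residue - left):last_signal_residue]
    downstream = seq[last_signal_residue:last_signal_residue + right]
    return f"{upstream}-{downstream}"


# Legible N-terminal residues of NOV21b (Table 21E).
n_terminus = "MSVKPSWGPGPSEGVTAVPTSDLGEIHNWTELLD"

print(cleavage_motif(n_terminus, 17))  # VTA-VP
```

The same helper applied at position 14 returns SEG-VT, the motif quoted for a cleavage between positions 14 and 15.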



Table 21E. Encoded NOV21b Protein Sequence (SEQ ID NO:66) 

MSVKPSWGPGPSEGVTAVPTSDLGEIHNWTELLDHLFNHTLSECHVELSQSTKRVVLFALYIjAMPW 
GLVENLLVICVNWRGSGRAGLMNLYILNmiADLGIVLSLPVWMPEVTLDYTWLWGSFSOT 
FVIOMYSS I FFLVCLSVDRYVTLTGQPKSRRHCLLLCAYVAVFVMCWLPYHVTLLLLTL^ 
HLVHLLYFFYDVIDCFSMLHCVINPILYNFLSPHFRGRLLNAWHYLPKDQTKAGTCASSSSCSTQH 
SIXITKGDSQPAAAAAPHPEPSLSFQAHHLLPNTSPISPTQPLTPS 



The NOV21b amino acid sequence was found to have 156 of 157 amino acid residues 
(99%) identical to, and 156 of 157 amino acid residues (99%) similar to, the 404 amino acid 
residue ptnr:SWISSNEW-ACC:O15218 protein from Homo sapiens (Human) 
(ADRENOMEDULLIN RECEPTOR (AM-R)) (E = 1.4e''^*). 
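
The identity and similarity percentages quoted with these match counts can be reproduced from the counts themselves. The sketch below assumes the percentage is the integer part of the ratio (rounded down); that rounding rule is inferred from the figures quoted in this section, not from any stated method, and the function name is hypothetical.

```python
def percent_of_alignment(matches: int, aligned_length: int) -> int:
    """Whole-number percentage of aligned positions that match, rounded down."""
    if aligned_length <= 0 or not 0 <= matches <= aligned_length:
        raise ValueError("matches must lie between 0 and the aligned length")
    return 100 * matches // aligned_length


# NOV21b versus the 404-residue adrenomedullin receptor: 156 of 157 residues.
print(percent_of_alignment(156, 157))  # 99
```

Note that the percentage describes only the aligned region (157 residues here), not the full 314-residue NOV21b polypeptide or the 404-residue reference protein.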

NOV21b is expressed in at least the following tissues: heart, skeletal muscle, liver, 
pancreas, stomach, spleen, lymph node, bone marrow, adrenal gland, and thyroid. 
Expression information was derived from the tissue sources of the sequences that were 
included in the derivation of the sequence of NOV21b. 

NOV21c 

A disclosed NOV21c (designated CuraGen Acc. No. CG56477-03), which encodes a 
novel Adrenomedullin Receptor-like protein and includes the 965 nucleotide sequence (SEQ 
ID NO:67), is shown in Table 21F. An open reading frame for the mature protein was 
identified beginning with an ATG initiation codon at nucleotides 3-5 and ending with a TGA 
stop codon at nucleotides 963-965. Putative untranslated regions are underlined in Table 
21F, and the start and stop codons are in bold letters. 



Table 21F. NOV21c Nucleotide Sequence (SEQ ID NO:67) 

CGATGTCAGTGAAACCCAGCTGGGGGCCTGGCCCCTCGGAQGGGGTCACCGCAQTQCCTAC 
GATCCACAACTGGACCGAGCTGOTTGACCTCTTCAACCACACTTTGTCTGA 
ACCAAGCGCGTGGTCCrrCTTTG<X:CTCTACCTGGCCATGTTTGTGGTTG<^CTGGTGG 
GCGTCAACTGGCGCGGCTCAGGCCGGGCAGGGCTGATGAACCTCTACATCCrCAAC^^ 

CATTCTCCTGTCTCTGCCCGTGTGGATGCTGGAGGTCACGCTGGACTACACCTGGCTCTGGGGCAGCTTCTCCr 

CGCTTCACTCACTACTTCTACTTTGTCAACATGTATAGCAGCATCTTCTTCCTGCrrGCCCTTCCCTCTCATC^ 

TCITCAATGTGCTGACAGCCTGCCGGCTGCGGCAGCCAGGACAACCCAAGAGCCGGC^ 

CGCCTACGTGGCCGTCTTTGTCATGTGCTGGCTGCCCTATCATGTGACCCTGCTGCTGCTCACACTGCATGGGACC 

CACATCTCCCTCCACTGCCACCTGGTCCT^CCTGCrCTACTTCTTCTATGATGTCATTGACTGCTTCTCCATGCTGC 

ACTGTGTCATCAACCCCATCCTTTACAACTTTCTCAGCCCACACTTCCGGGGCCGGCTCCTGAATGCTGTAGTCCA 

TTACCTTCCTAAGGACCAGACCAAGGGCGGGCACATGCGCCTCCTCTTCCTCCTGTTCCACCCAGCATTCCATCAT 

CATCACa^GGTGATAGCCAGCCTGCTGCAGCAGCCCCCCACCCTGAGCCAAGCCrGAGCTTTa^GGC^ 

TGCTTCCAAATACTTCCCCCATCTCTCCCACTCAGCCrCTTACACCCAGCTGA 
The disclosed NOV21c nucleic acid sequence maps to chromosome 12 and has 549 of 
559 bases (98%) identical to a gb:GENBANK-ID:AR012140|acc:AR012140.1 mRNA from 
Unknown. (Sequence 1 from patent US 5763218) (E = 93e 

A disclosed NOV21c polypeptide (SEQ ID NO:68) is 320 amino acid residues in 
length and is presented using the one-letter amino acid code in Table 21G. The SignalP, 
Psort and/or Hydropathy results predict that NOV21c has a signal peptide and is likely to be 
localized to the plasma membrane with a certainty of 0.6000. In alternative embodiments, a 
NOV21c polypeptide is located to the Golgi body with a certainty of 0.4000, the endoplasmic 
reticulum (membrane) with a certainty of 0.3000, or the mitochondrial inner membrane with 
a certainty of 0.0300. The SignalP predicts a likely cleavage site for a NOV21c peptide 
between amino acid positions 14 and 15, i.e. at the sequence SEG-VT. 



Table 21G. Encoded NOV21c Protein Sequence (SEQ ID NO:68) 

CVNWRGSGRAGLMNLYILNmiADLGIVLSLPVl^LEV^ 

WFNVLTACRLRQPGQPKSI^CLLLCAYVAVFVMCWIiPYHVTLLLLTLHGTHISLHai^ 
l^HCVINPILYNFLSPHFRGRLIJJAVVHYIiPKDQTKGCaiMRLLFLLFHPAFI^^ 

AHHLLPNTSPI SPTQPIiTPS 

The NOV21c amino acid sequence was found to have 159 of 178 amino acid residues 
(89%) identical to, and 160 of 178 amino acid residues (89%) similar to, the 404 amino acid 
residue ptnr:SWISSNEW-ACC:O15218 protein from Homo sapiens (Human) 
(ADRENOMEDULLIN RECEPTOR (AM-R)) (E = 7.1e-^^). 

NOV21c is expressed in at least the following tissues: heart, skeletal muscle, liver, 
pancreas, stomach, spleen, lymph node, bone marrow, adrenal gland, and thyroid. 
Expression information was derived from the tissue sources of the sequences that were 
included in the derivation of the sequence of NOV21c. 

Homologies to any of the above NOV21a, NOV21b and NOV21c proteins will be 
shared by the other NOV21 proteins insofar as they are homologous to each other as shown 
above. Any reference to NOV21 is assumed to refer to NOV21a, NOV21b and NOV21c 
proteins in general, unless otherwise noted. 

NOV21a, NOV21b and NOV21c are very closely homologous as is shown in the 
amino acid alignment in Table 21H. 

Table 21H. ClustalW of NOV21a, NOV21b and NOV21c 

[ClustalW alignment figure not recoverable from the source scan. The aligned sequences end at residue 404 (NOV21a), residue 314 (NOV21b) and residue 320 (NOV21c).] 



NOV21a also has homology to the amino acid sequences shown in the BLASTP data 
in Table 21I. 
Table 21I. BLAST results for NOV21a 

Gene Index/Identifier                     Protein/Organism                          Length (aa)  Identity (%)    Positives (%)   Expect 
gi|6005705|ref|NP_009195.1| (NM_007264)   adrenomedullin receptor; G-protein-       404          404/404 (100%)  404/404 (100%)  0.0 
                                          coupled receptor similar to the 
                                          adrenomedullin receptor [Homo sapiens] 
gi|6680654|ref|NP_031438.1| (NM_007412)   adrenomedullin receptor [Mus musculus]    395          278/376 (73%)   317/376 (83%)   e-148 
gi|16757998|ref|NP_445754.1| (NM_053302)  adrenomedullin receptor [Rattus           398          287/384 (72%)   327/394 (82%)   e-145 
                                          norvegicus] 
gi|543446|pir||S40685                     probable G protein-coupled receptor       395          285/381 (74%)   324/381 (84%)   e-143 
                                          G10d - rat 
gi|12643978|sp|P31392|ADMR_RAT            ADRENOMEDULLIN RECEPTOR (AM-R)            395          282/380 (74%)   321/380 (84%)   e-142 
                                          (G10D) (NOW) 



The homology of these sequences is shown graphically in the ClustalW analysis 
shown in Table 21J. 



Table 21J. ClustalW Analysis of NOV21 

1) NOV21a (SEQ ID NO:64) 
2) NOV21b (SEQ ID NO:66) 
3) NOV21c (SEQ ID NO:68) 
4) gi|6005705 (SEQ ID NO:304) 
5) gi|6680654 (SEQ ID NO:305) 
6) gi|16757998 (SEQ ID NO:306) 
7) gi|543446 (SEQ ID NO:307) 
8) gi|12643978 (SEQ ID NO:308) 

[ClustalW alignment figure not recoverable from the source scan. The aligned sequences end at residue 404 (NOV21a), 314 (NOV21b), 320 (NOV21c), 404 (gi|6005705), 395 (gi|6680654), 398 (gi|16757998), 395 (gi|543446) and 395 (gi|12643978).] 



Tables 21K and 21L list the domain descriptions from DOMAIN analysis results 
against NOV21. This indicates that the NOV21 sequence has properties similar to those of 
other proteins known to contain these domains. 



Table 21K. Domain Analysis of NOV21c 

hmmpfam - search a single seq against HMM database 
HMM file: pfamHMMs 

Scores for sequence family classification (score includes all domains): 

Model   Description                                    Score   E-value   N 
7tm_1   7 transmembrane receptor (rhodopsin family)    157.3   8e-49     2 

Parsed for domains: 

Model   Domain   seq from   seq to        hmm from   hmm to        score   E-value 
7tm_1   1/2      70         142      ..   1          75       [.   74.6    8.1e-23 
7tm_1   2/2      143        236      ..   173        259      .]   86.7    1.3e-26 

Alignments of top-scoring domains: 

7tm_1: domain 1 of 2, from 70 to 142: score 74.6, E = 8.1e-23 

* - >(3NlI.Vilvilrtkklr . tptnif ilNIiAvADLLf lit Ippwalyylv 

NOV21C 70 ENLLVICVNWR-GSGRaGLMNLYILNMAIADLGIVLSLPVWMI^^ 115 

ggsedWpfGsalCklvtaldvvnmyaSil<-* (SEQ ID NO:3 09X) 

++)++[ 1+ I ++++++++ I [ I 1 + I I + 
NOV21C D--YTWLWGSFSCRFTHyFYFVNMYSSIF 142 (SEQ ID NO:310) 



7tm_1: domain 2 of 2, from 143 to 236: score 86.7, E = 1.3e-26 

* - >F1 IP 1 1 vi IvcYt r 1 1 r 1 1 r kaakt 1 IvvwvFvl CWlP 

II 11+ +1 + I++ +++++ 1 I +++++++++ + +t+++| III Mil 

NOV21C 143 FLLPFPLITVFNVLTACRLRqpgqpksrRHCXLLCAYVAVFVMCWLP 189 

yf ivllldtlc . lsiimsstCelervlptallvtlwLayvNsclNPiIY< (SEQ ID 

NO:311) 

I+++III 11++++! I++I I ++I ++++I+ +++++++++ 1 1 1 + 1 

NOV21C YHVTIiLLLTLHgTHI--SI*HCHr*VHLLYFFYDVIDCFSMLHCVINPILY 236 {SEQ ID 
NO:312) 



Table 21L. Domain Analysis of NOV21a 

gnl|Pfam|pfam00001, 7tm_1, 7 transmembrane receptor (rhodopsin family). 
CD-Length = 254 residues, 100.0% aligned 
Score = 147 bits (371), Expect = 1e-36 

KOV21:70 ENLLVICVNWRGSGRAGLMNXiYILNMAIADLGIVLSIiPVWMLEV^ 129 

mil I I |^.,^||+|..||| I I + t^^i 1+ 

Sb j ct : 1 <aiIiI*VILVIIJRTKiajRTPTNIFIiNIAVADLLFIJ:.TLPPW 60 

NOV21:130 HYFYFVNMYSSIFFLVCLSVDRYVTLTSASPSWORYQHRVRRAMCAGIWVLSAIIPLP 189 

+ II l + ll I +I + III+ + ^ I H- -H H1I+ ++ II + 

Sbjct : 61 6ALFVVNGyASIIJJ:.TAISIDRYI*AIVHPLRYRRIRTPRRAKVLILLVWVIAIJi 120 

NOV21:190 VHIQLVEGPEPMCO^FMAPFETYSTWAIJVVALSTTILGFLLPFPLITVFNVLTAOT 249 

+ II + + I +I++Ikll +1 1 11 + 

Sbjct:121 LFSWLRTVEEGNTTVCLIDFPEESVKRSYVLLSTI*VGFVIiPLLVTLVCYTRILRTLRKRA 180 

NOV21:250 QP KSRRHCIirXCAYVAVFVMCWLPYHVTIJJIJJTLHGTHISLHCHLVHLLYF 300 

1+ +1 I lll+ll I + + +1 

Sbjct: 181 RSQRSLKRRSSSERKAAKMLLWVWFVLCW LPYHIVLLLDSLCLLSIWRVLPT 234 

NOV21:301 FYDVIDCFSMLHCVINPILY 320 (SEQ ID NO:313) 

+ + ++ +III+I 
Sbjctt235 ALLITLWLAYVNSCIaNPIIY 254 (SEQ ID NO:314) 



The rhodopsin-like GPCRs themselves represent a widespread protein family that 
includes hormone, neurotransmitter and light receptors, all of which transduce extracellular 
signals through interaction with guanine nucleotide-binding (G) proteins. Although their 
activating ligands vary widely in structure and character, the amino acid sequences of the 
receptors are very similar and are believed to adopt a common structural framework 
comprising 7 transmembrane (TM) helices. See InterPro IPR000276. 

G-protein-coupled receptors (GPCRs) constitute a vast protein family that 
encompasses a wide range of functions (including various autocrine, paracrine and endocrine 
processes). They show considerable diversity at the sequence level, on the basis of which 
they can be separated into distinct groups. The term clan is used to describe the GPCRs, as 
they embrace a group of families for which there are indications of evolutionary relationship, 
but between which there is no statistically significant similarity in sequence. The currently 
known clan members include the rhodopsin-like GPCRs, the secretin-like GPCRs, the cAMP 
receptors, the fungal mating pheromone receptors, and the metabotropic glutamate receptor 
family. 

Adrenomedullin (AM, or ADM; 103275) is a 52-amino acid peptide involved in 
vasodilation and body fluid homeostasis. By PCR on human genomic DNA using primers 
based on the rat ADM receptor (Admr), Hanze et al. (1997) isolated a cDNA encoding 
human ADMR, which they called AMR. Sequence analysis predicted that the 404-amino 
acid, 7-transmembrane ADMR protein, which is 73% identical to the rat ADM receptor, 
contains 2 potential N-terminal N-linked glycosylation sites and several potential ser and thr 
C-terminal cytoplasmic phosphorylation sites. Northern blot analysis detected highest 
expression of a major 1.8-kb ADMR transcript in heart, skeletal muscle, liver, pancreas, 
stomach, spleen, lymph node, bone marrow, adrenal gland, and thyroid, with lower 
expression in brain, lung, placenta, small intestine, thymus, and leukocytes. Southern blot 
analysis indicated that ADMR is a single-copy gene. See Hanze et al., Biochem. Biophys. 
Res. Commun. 240: 183-188, 1997, PubMed ID: 9367907. 

The protein similarity information, expression pattern, cellular localization, and map 
location for the NOV21 protein and nucleic acid disclosed herein suggest that this 
Adrenomedullin Receptor-like protein may have important structural and/or physiological 
functions characteristic of the Adrenomedullin Receptor family. Therefore, the nucleic acids 
and proteins of the invention are useful in potential diagnostic and therapeutic applications 
and as a research tool. These include serving as a specific or selective nucleic acid or protein 
diagnostic and/or prognostic marker, wherein the presence or amount of the nucleic acid or 
the protein are to be assessed. These also include potential therapeutic applications such as 
the following: (i) a protein therapeutic, (ii) a small molecule drug target, (iii) an antibody 
target (therapeutic, diagnostic, drug targeting/cytotoxic antibody), (iv) a nucleic acid useful in 
gene therapy (gene delivery/gene ablation), (v) an agent promoting tissue regeneration in 
vitro and in vivo, and (vi) a biological defense weapon. 

The NOV21 nucleic acids and proteins of the invention have applications in the 
diagnosis and/or treatment of various diseases and disorders. For example, the compositions 
of the present invention will have efficacy for the treatment of patients suffering from: 
developmental diseases, MHCII and III diseases (immune diseases), Taste and scent 
detectability Disorders, Burkitt's lymphoma, Corticoneurogenic disease, Signal Transduction 
pathway disorders, Retinal diseases including those involving photoreception, Cell Growth 
rate disorders; Cell Shape disorders, Feeding disorders; control of feeding; potential obesity 
due to over-eating; potential disorders due to starvation (lack of appetite), non-insulin- 
dependent diabetes mellitus (NIDDM1), bacterial, fungal, protozoal and viral infections 
(particularly infections caused by HIV-1 or HIV-2), pain, cancer (including but not limited to 
Neoplasm; adenocarcinoma; lymphoma; prostate cancer; uterus cancer), anorexia, bulimia, 
asthma, Parkinson's disease, acute heart failure, hypotension, hypertension, urinary retention, 
osteoporosis, Crohn's disease; multiple sclerosis; and Treatment of Albright Hereditary 
Osteodystrophy, angina pectoris, myocardial infarction, ulcers, asthma, allergies, benign 
prostatic hypertrophy, and psychotic and neurological disorders, including anxiety, 
schizophrenia, manic depression, delirium, dementia, severe mental retardation, 
Dentatorubro-pallidoluysian atrophy (DRPLA), Hypophosphatemic rickets, autosomal 
dominant (2), Acrocallosal syndrome and dyskinesias, such as Huntington's disease or Gilles 
de la Tourette syndrome and/or other pathologies and disorders of the like. The polypeptides 
can be used as immunogens to produce antibodies specific for the invention, and as vaccines. 
They can also be used to screen for potential agonist and antagonist compounds. For 
example, a cDNA encoding the adrenomedullin-like protein may be useful in gene therapy, 
and the adrenomedullin-like protein may be useful when administered to a subject in need 
thereof. By way of nonlimiting example, the compositions of the present invention will have 
efficacy for treatment of patients suffering from bacterial, fungal, protozoal and viral 
infections (particularly infections caused by HIV-1 or HIV-2), pain, cancer (including but not 
limited to Neoplasm; adenocarcinoma; lymphoma; prostate cancer; uterus cancer), anorexia, 
bulimia, asthma, Parkinson's disease, acute heart failure, hypotension, hypertension, urinary 
retention, osteoporosis, Crohn's disease; multiple sclerosis; and Treatment of Albright 
Hereditary Osteodystrophy, angina pectoris, myocardial infarction, ulcers, asthma, allergies, 
benign prostatic hypertrophy, and psychotic and neurological disorders, including anxiety, 
schizophrenia, manic depression, delirium, dementia, severe mental retardation and 
dyskinesias, such as Huntington's disease or Gilles de la Tourette syndrome and/or other 
pathologies and disorders. The novel nucleic acid encoding adrenomedullin-like protein, and 
the adrenomedullin-like protein of the invention, or fragments thereof, may further be useful 
in diagnostic applications, wherein the presence or amount of the nucleic acid or the protein 
are to be assessed. These materials are further useful in the generation of antibodies that bind 
immunospecifically to the novel substances of the invention for use in therapeutic or 
diagnostic methods, cardiomyopathy, atherosclerosis, hypertension, congenital heart defects, 
aortic stenosis, atrial septal defect (ASD), atrioventricular (A-V) canal defect, ductus 
arteriosus, pulmonary stenosis, subaortic stenosis, ventricular septal defect (VSD), valve 
diseases, tuberous sclerosis, scleroderma, obesity, transplantation; Colon cancer; Colorectal 
cancer; Colorectal cancer; familial nonpolyposis, type 6; Esophageal cancer; 
Hepatoblastoma; Hypobetalipoproteinemia, familial, 2; Lung cancer; Metaphyseal 
chondrodysplasia, Murk Jansen type; Ovarian carcinoma, endometrioid type; Pilomatricoma; 
Pseudo-Zellweger syndrome as well as other diseases, disorders and conditions. 

These antibodies may be generated according to methods known in the art, using 
prediction from hydrophobicity charts, as described in the "Anti-NOVX Antibodies" section 
below. The disclosed NOV21 protein has multiple hydrophilic regions, each of which can be 
used as an immunogen. In one embodiment, a contemplated NOV21 epitope is from about 
amino acids 10 to 40. In another embodiment, a contemplated NOV21 epitope is from about 
amino acids 160 to 165. In other specific embodiments, contemplated NOV21 epitopes are 
from about amino acids 250 to 265, 270 to 280 and 300 to 320. 
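
The text refers to prediction from hydrophobicity charts without naming a scale or window size. As a minimal sketch under stated assumptions, the example below uses the Kyte-Doolittle hydropathy scale with a 7-residue sliding window and flags windows whose mean score is negative as hydrophilic. The scale choice, the window size, the threshold, and the function names are all assumptions made for illustration, and the test sequence is the legible C-terminal tail of NOV21a from Table 21B.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic, negative = hydrophilic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq: str, window: int = 7) -> list[float]:
    """Mean hydropathy of every full-length window along the sequence."""
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[aa] for aa in seq.upper()]
    return [sum(scores[i:i + window]) / window
            for i in range(len(scores) - window + 1)]


def hydrophilic_windows(seq: str, window: int = 7, threshold: float = 0.0):
    """1-based inclusive (start, end) ranges whose mean hydropathy is below threshold."""
    return [(i + 1, i + window)
            for i, score in enumerate(hydropathy_profile(seq, window))
            if score < threshold]


# Legible C-terminal residues of NOV21a (last line of Table 21B).
c_terminal_tail = "LSFQAHHLLPNTSPISPTQPLTPS"

for start, end in hydrophilic_windows(c_terminal_tail):
    print(start, end, c_terminal_tail[start - 1:end])
```

Positions printed here are relative to the 24-residue fragment, not to the full-length protein, and candidate immunogens chosen this way would still need the other criteria the specification discusses.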

NOV22 

One NOVX protein of the invention, referred to herein as NOV22, includes two 
Tyrosine Phosphatase-like proteins. The disclosed proteins have been named NOV22a, and 
NOV22b. 

NOV22a 

A disclosed NOV22a (designated CuraGen Acc. No. CG57256-01), which encodes a 
novel Protein Tyrosine Phosphatase-like protein and includes the 549 nucleotide sequence 
(SEQ ID NO:69), is shown in Table 22A. An open reading frame for the mature protein was 
identified beginning with an ATG initiation codon at nucleotides 30-32 and ending with a 
TAA stop codon at nucleotides 540-542. Putative untranslated regions are underlined in 
Table 22A, and the start and stop codons are in bold letters. 

Table 22A. NOV22a Nucleotide Sequence (SEQ ID NO:69) 

TATTTTTTA^CTAAATTi^TACACCTCGiA TGAACCACCCAGCTCCTGTQAAAGTCa^CaTACA^ 
TTTCCTATTACACACAATCCAACCAATGTGACCTTAAATAAATTTATAGAG^^ 
CACAATAGTAAGAGTATGTGAAGCaACTTATGACT^CTACTCTTGTGGAGAAAGAAOT 
GGCCTTTTGGTGATGGTGCa^CCACCATCCZ^CCAGATTGTTGCTGATTGGTTAC^ 

AGCATCAGTTGAAGGTGGAATGAAACATGAAGATGCAGTACAATTCATAGGACA^^ 
AAAGCAAGCAACITTTGTATTTGGAGAAGTATCATCCTAAAATGCGGCTGC^ 

ATAftACAACTGTTGCATTCAATAAAACTGGG 

The disclosed NOV22a nucleic acid sequence maps to chromosome 1 and has 505 of 
546 bases (92%) identical to a gb:GENBANK-ID:HSU48296|acc:U48296.1 mRNA from 
Homo sapiens (Homo sapiens protein tyrosine phosphatase PTPCAAX1 (hPTPCAAX1) 
mRNA, complete cds) (E = 9.8e''^^). 

A disclosed NOV22a polypeptide (SEQ ID NO:70) is 170 amino acid residues in 
length and is presented using the one-letter amino acid code in Table 22B. The SignalP, 
Psort and/or Hydropathy results predict that NOV22a does not have a signal peptide and is 
likely to be localized to the endoplasmic reticulum (membrane) with a certainty of 0.8500. In 
alternative embodiments, a NOV22a polypeptide is located to the plasma membrane with a 
certainty of 0.4400, the mitochondrial inner membrane with a certainty of 0.1000, or the 
Golgi body with a certainty of 0.1000. 



Table 22B. Encoded NOV22a Protein Sequence (SEQ ID NO:70) 

MtraPAPVKVTYKNMRFPITHNPTNVTIi^ 

IVADWIJHFVKIKFCEEPGCYIAVNCIVGLGKAPVI»VAIJ^VEG<3«KHEIJA^ 

MRLRFKDSNSHIMNCCIQ 

The NOV22a amino acid sequence was found to have 145 of 170 amino acid residues 
(85%) identical to, and 152 of 170 amino acid residues (89%) similar to, the 173 amino acid 
residue ptnr:SPTREMBL-ACC:O00648 protein from Homo sapiens (Human) (PROTEIN 
TYROSINE PHOSPHATASE PTPCAAX1) (E = 1.9e'^^). 

NOV22a is predicted to be expressed in the liver because of the expression pattern of 
(GENBANK-ID: gb:GENBANK-ID:HSU48296|acc:U48296.1), a closely related Homo 
sapiens protein tyrosine phosphatase PTPCAAX1 (hPTPCAAX1) mRNA, complete cds 
homolog in species Homo sapiens. 
NOV22b 

A disclosed NOV22b (designated CuraGen Acc. No. CG57256-02), which encodes a 
novel Protein Tyrosine Phosphatase-like protein and includes the 850 nucleotide sequence 
(SEQ ID NO:71), is shown in Table 22C. An open reading frame for the mature protein was 
identified beginning with an ATG initiation codon at nucleotides 1-3 and ending with a TAG 
stop codon at nucleotides 529-531. Putative untranslated regions are underlined in Table 
22C, and the start and stop codons are in bold letters. 



Table 22C. NOV22b Nucleotide Sequence (SEQ ID NO:71) 

ATGAACCACCCAGCTCCTGTGATGAACCACCCAGCTCCTGTGAAAGTCACATACAAGAACATGAGATTTCCTATTAC 

ACACAATCCAACCAATGTGACCTTAAATAAATTTATAGAGGAGCTTAAGAAGTATGGAGCTACCACAATAGTAAGAG 

TATGTGAAGCAACTTATGACACTACTCTTGTGGAGAAAGAAGGTATCCATGTTCTCAATTGGCCTTTTGGTGATGGT 

GCACCACCATCCAACCAGATTGTTGCTGATTGGTTACATTTTGTAAAAATTAAGTTTTGTGAAGAACCTGGTTGTTA 

TATTGCTGTTAATTGCATTGTAGGCCTTGGGAAAGCTCCAGTACTTGTTGCCCTAGCATCAGTTGAAGGTGGAATGA 

AACATGAAGATGCAGTACAATTCATAGGACAAAAGCGGAGTGGAGCTTTTAAAAGCAAGCAACTTTTGTATTTGGAG 

AAGTATCATCCTAAAATGCGGCTGCGCTTCAAAGATTCCAATAGTGCTGCGCTTCaAAGATTCCAATA GTGCTGCGC 

TTCAAAGATTCCAATAGTGCTGCGCTTCAAAGATTCCAATAGTGCTGCGCTTCAAAGATTCCAATAGTGCTGCGCTT 

CAAAGATTCCAATAGTGCTGCGCTTCAAAGATTCCAATAGTGCTGCGCTTCAAAGATTCCAATAGTGCTGCGCTT 

AAGATTCCAATAGTGCTGCQCTTCAAAGATTCCAATAGTGCTGCGCTTCAAAGATTCCAATAGTGCTGCGCTTCA^ 

GATTCCAATAGTGCTGCGCTTCAAAGATTCCAATAGTGCTQCGOTTCAAAGATTCC^ 

TTC 

The disclosed NOV22b nucleic acid sequence maps to chromosome 6q12 and has 452 
of 486 bases (93%) identical to a gb:GENBANK-ID:HSU48296|acc:U48296.1 mRNA from 
Homo sapiens (Homo sapiens protein tyrosine phosphatase PTPCAAX1 (hPTPCAAX1) 
mRNA, complete cds) (E = 2.8e'^^). 

A disclosed NOV22b polypeptide (SEQ ID NO:72) is 176 amino acid residues in 
length and is presented using the one-letter amino acid code in Table 22D. The SignalP, 
Psort and/or Hydropathy results predict that NOV22b does not have a signal peptide and is 
likely to be localized to the endoplasmic reticulum (membrane) with a certainty of 0.8500. In 
alternative embodiments, a NOV22b polypeptide is located to the plasma membrane with a 
certainty of 0.8500, the microbody (peroxisome) with a certainty of 0.4400, or the 
mitochondrial inner membrane with a certainty of 0.1000. 
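
The correspondence between an open reading frame and its encoded polypeptide can be illustrated by translating codons with the standard genetic code. This is a sketch only: the function and table names are hypothetical, and the input is the first 48 nucleotides of the NOV22b sequence as printed in Table 22C.

```python
from itertools import product

# Standard genetic code, enumerated in T, C, A, G order for each codon position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA coding sequence from its first base up to the first stop codon."""
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":  # stop codon terminates translation
            break
        protein.append(amino_acid)
    return "".join(protein)


# First 48 nucleotides of the NOV22b ORF (Table 22C), starting at the ATG.
nov22b_orf_start = "ATGAACCACCCAGCTCCTGTGATGAACCACCCAGCTCCTGTGAAAGTC"

print(translate(nov22b_orf_start))  # MNHPAPVMNHPAPVKV
```

Characters outside A, C, G and T (for example scanning errors in a printed sequence) raise a `KeyError` in this sketch, which makes damaged input easy to spot.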



Table 22D. Encoded NOV22b Protein Sequence (SEQ ID NO:72) 



MNHPAPVMNHPAPVKVTYKNMRFPITHNPTNVTLNKFIEELKKYGATTIVRVCEATYDTTLVEKEGIHVLNWPFGDG 
APPSNQIVADWLHFVKIKFCEEPGCYIAVNCIVGLGKAPVLVALASVEGGMKHEDAVQFIGQKRSGAFKSKQLLYLE 
KYHPKMRLRFKDSNSAALQRFQ 



The NOV22b amino acid sequence was found to have 138 of 161 amino acid residues 
(85%) identical to, and 145 of 161 amino acid residues (90%) similar to, the 173 amino acid 
residue ptnr:SPTREMBL-ACC:O00648 protein from Homo sapiens (Human) (PROTEIN 
TYROSINE PHOSPHATASE PTPCAAX1)(E = 8.2e"^^). 

NOV22b is expressed in at least the brain. Expression information was derived from 
the tissue sources of the sequences that were included in the derivation of the sequence of 
NOV22b. The sequence is also predicted to be expressed in the liver because of the 
expression pattern of (GENBANK-ID: gb:GENBANK-ID:HSU48296|acc:U48296.1), a 
closely related Homo sapiens protein tyrosine phosphatase PTPCAAX1 (hPTPCAAX1) 
mRNA, complete cds homolog in species Homo sapiens. 

Homologies to any of the above NOV22a proteins will be shared by the other NOV22 
proteins insofar as they are homologous to each other as shown above. Any reference to 
NOV22 is assumed to refer to both of the NOV22 proteins in general, unless otherwise noted. 

NOV22a and NOV22b are very closely homologous as is shown in the amino acid 
alignment in Table 22E. 



Table 22E. ClustalW of NOV22a and NOV22b 

NOV22a   -------MNHPAPVKVTYKNMRFPITHNPTNVTLNKFIEELKKYGATTIV 
NOV22b   MNHPAPVMNHPAPVKVTYKNMRFPITHNPTNVTLNKFIEELKKYGATTIV 

NOV22a   RVCEATYDTTLVEKEGIHVLNWPFGDGAPPSNQIVADWLHFVKIKFCEEP 
NOV22b   RVCEATYDTTLVEKEGIHVLNWPFGDGAPPSNQIVADWLHFVKIKFCEEP 

NOV22a   GCYIAVNCIVGLGKAPVLVALASVEGGMKHEDAVQFIGQKRSGAFKSKQL 
NOV22b   GCYIAVNCIVGLGKAPVLVALASVEGGMKHEDAVQFIGQKRSGAFKSKQL 

[Remaining alignment block not recoverable from the source scan.] 




NOV22a also has homology to the amino acid sequences shown in the BLASTP data 
listed in Table 22F. 
Table 22F. BLAST results for NOV22a 

Gene Index/Identifier                     Protein/Organism                          Length (aa)  Identity (%)    Positives (%)   Expect 
gi|4506283|ref|NP_003454.1| (NM_003463)   protein tyrosine phosphatase type IVA,    173          145/170 (85%)   152/170 (89%)   3e-83 
                                          member 1; Protein tyrosine phosphatase 
                                          IVA1 [Homo sapiens] 
gi|17528929|gb|AAL38661.1| (AY062269)     protein tyrosine phosphatase 4a1          173          144/170 (84%)   151/170 (88%)   5e-82 
                                          [Rattus norvegicus] 
gi|4506285|ref|NP_003470.1| (NM_003479)   protein tyrosine phosphatase type IVA,    167          126/170 (74%)   144/170 (84%)   2e-72 
                                          member 2, isoform 1; protein tyrosine 
                                          phosphatase IVA; protein tyrosine 
                                          phosphatase IVA2; phosphatase of 
                                          regenerating liver 2 [Homo sapiens] 
gi|1246236|gb|AAB39331.1| (L48937)        ptp-IV1b, PTP-IV1 gene product            167          125/170 (73%)   144/170 (84%)   4e-72 
                                          [Homo sapiens] 
gi|7513774|pir||JC5981                    prenylated protein tyrosine phosphatase   167          124/170 (72%)   143/170 (83%)   2e-71 
                                          (EC 3.1.3.-) 2 - mouse 



The homology of these sequences is shown graphically in the ClustalW analysis 
shown in Table 22G. 







Table 22G. ClustalW Analysis of NOV22 

1) NOV22a (SEQ ID NO:70) 
2) NOV22b (SEQ ID NO:71) 
3) gi|1142410 (SEQ ID NO:315) 
4) gi|4503763 (SEQ ID NO:316) 
5) gi|544335 (SEQ ID NO:317) 
6) gi|1706877 (SEQ ID NO:318) 
7) gi|1094668 (SEQ ID NO:319) 

[ClustalW alignment figure not recoverable from the source scan. The alignment rows are labeled NOV22a, NOV22b, gi|4506283|, gi|17528929|, gi|4506285|, gi|1246236| and gi|7513774|.] 



Table 22H lists the domain description from DOMAIN analysis results against 
NOV22. This indicates that the NOV22 sequence has properties similar to those of other 
proteins known to contain the protein tyrosine phosphatase domain and the protein tyrosine 
phosphatase catalytic domain motif. 



Table 22H Domain Analysis of NOV22 

gnl|Pfam|pfam00102, Y_phosphatase, Protein-tyrosine phosphatase. 
CD-Length = 235 residues. 
Score = 44.3 bits (103), Expect = 6e-06 

NOV22:  17 PITHNPTNVTLNKFIEELKKYGATTIVRVCEATYDTTLVEKEG--IHVLNWPFGDGAPPS  74
Sbjct:  96 SLTYGDFTVTCVSVEKKKDDYTVRTLELTNSGDDETRTVKHYHYTGWP-DHGVPES 150

NOV22:  75 NQIVADWLHFVKIKFCEEPGCYIAVNCIVGLGKAPVLVALASV--EGGMKHEDAVQ 128

[Further rows beginning at Sbjct: 151, NOV22: 129 and Sbjct: 211; residues not recoverable.]

(SEQ ID NO:320) 



gnl|Smart|smart00404, PTPc_motif, Protein tyrosine phosphatase, catalytic 
domain motif 
CD-Length = 105 residues, 93.3% aligned 
Score = 39.7 bits (91), Expect = 1e-04 

NOV22:  61 HVLNWPFGDGAPPSNQIVADWLHFVKIKFCEEPGCY-IAVNCIVGLGKAPVLVALASV-- 117
Sbjct:   6 HYTGWPD-HGVPESPDSILEFLRAVKKSLNKSANNGPVWHCSAGVGRTGTFVAIDILLQ  64

NOV22: 118 EGGMKHEDAVQFIGQKRSGAFKSK-QLLYLEKYH 150 (SEQ ID NO:322)
Sbjct:  65 QLEACTGEVDIFDIVKELRSQRPGAVQTLEQYLFLYRAL 103 (SEQ ID NO:323)



Cellular processes involving growth, differentiation, transformation and metabolism 
are often regulated in part by protein phosphorylation and dephosphorylation. The protein 
tyrosine phosphatases (PTPs), which hydrolyze the phosphate monoesters of tyrosine 
residues, all share a common active site motif and are classified into 3 groups. These include 
the receptor-like PTPs, the intracellular PTPs, and the dual-specificity PTPs, which can 
dephosphorylate at serine and threonine residues as well as at tyrosines. Diamond et al. 
(1994) described a PTP from regenerating rat liver that is a member of a fourth class. The 
gene, which they designated Prl1, was one of many immediate-early genes. Overexpression 
of Prl1 in stably transfected cells resulted in a transformed phenotype, which suggested that it 
may play some role in tumorigenesis. By using an in vitro prenylation screen, Cates et al. 
(1996) isolated 2 human cDNAs encoding PRL1 homologs, designated PTP(CAAX1) and 
PTP(CAAX2) (PRL2), that are farnesylated in vitro by mammalian farnesyl:protein 
transferase. Overexpression of these PTPs in epithelial cells caused a transformed phenotype 
in cultured cells and tumor growth in nude mice. The authors concluded that PTP(CAAX1) 
and PTP(CAAX2) represent a novel class of isoprenylated, oncogenic PTPs. Peng et al. 
(1998) reported that the human PTP(CAAX1) gene, or PRL1, is composed of 6 exons and 
contains 2 promoters. The predicted mouse, rat, and human PRL1 proteins are identical. Zeng 
et al. (1998) determined that the human PRL1 and PRL2 proteins share 87% amino acid 
sequence identity. 

The protein similarity information, expression pattern, cellular localization, and map 
location for the protein and nucleic acid disclosed herein suggest that this Protein Tyrosine 
Phosphatase-like protein may have important structural and/or physiological functions 
characteristic of the Protein Tyrosine Phosphatase family. Therefore, the nucleic acids and 
proteins of the invention are useful in potential diagnostic and therapeutic applications and as 
a research tool. These include serving as a specific or selective nucleic acid or protein 
diagnostic and/or prognostic marker, wherein the presence or amount of the nucleic acid or 
the protein is to be assessed. These also include potential therapeutic applications such as 
the following: (i) a protein therapeutic, (ii) a small molecule drug target, (iii) an antibody 
target (therapeutic, diagnostic, drug targeting/cytotoxic antibody), (iv) a nucleic acid useful in 
gene therapy (gene delivery/gene ablation), (v) an agent promoting tissue regeneration in 
vitro and in vivo, and (vi) a biological defense weapon. 

The nucleic acids and proteins of the invention have applications in the diagnosis 
and/or treatment of various diseases and disorders. For example, the compositions of the 
present invention will have efficacy for the treatment of patients suffering from: 
cardiomyopathy, dilated, 1K; cancer; Von Hippel-Lindau (VHL) syndrome, Alzheimer's 
disease, stroke, tuberous sclerosis, hypercalcemia, Parkinson's disease, Huntington's disease, 
cerebral palsy, epilepsy, Lesch-Nyhan syndrome, multiple sclerosis, ataxia-telangiectasia, 
leukodystrophies, behavioral disorders, addiction, anxiety, pain, neurodegeneration; 
cirrhosis, transplantation as well as other diseases, disorders 
and conditions. These materials are further useful in the generation of antibodies that bind 
immunospecifically to the novel substances of the invention for use in therapeutic or 
diagnostic methods. 

These antibodies may be generated according to methods known in the art, using 
prediction from hydrophobicity charts, as described in the "Anti-NOVX Antibodies" section 
below. The disclosed NOV22 protein has multiple hydrophilic regions, each of which can be 
used as an immunogen. In one embodiment, a contemplated NOV22 epitope is from about 
amino acids 10-22. In another embodiment, a contemplated NOV22 epitope is from about 
amino acids 25-32. In other specific embodiments, contemplated NOV22 epitopes are from 
about amino acids 38 to 39, 40 to 43, 50 to 52, 53 to 55, 57 to 60, 65 to 70, 75 to 80, 82 to 83, 
125 to 127, 128 to 132, 140 to 145 and 150 to 160. 

NOV23 

A disclosed NOV23 (designated CuraGen Acc. No. CG57228-01), which encodes a 
novel Aldo-Keto Reductase Family 7, member A3-like protein and includes the 1144 
nucleotide sequence (SEQ ID NO:73), is shown in Table 23A. An open reading frame for the 
mature protein was identified beginning with an ATG initiation codon at nucleotides 55-57 
and ending with a TAA stop codon at nucleotides 1120-1122. Putative untranslated regions 
are underlined in Table 23A, and the start and stop codons are in bold letters. 
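
The open reading frame coordinates above can be cross-checked against the stated polypeptide length with simple arithmetic. The following is a minimal sketch, not part of the disclosure; the function name `orf_protein_length` is hypothetical, and it assumes 1-based inclusive nucleotide coordinates with the stop codon counted inside the quoted range.

```python
def orf_protein_length(start_first_nt, stop_last_nt):
    """Number of residues encoded by an ORF.

    start_first_nt: 1-based position of the first base of the start codon.
    stop_last_nt:   1-based position of the last base of the stop codon.
    """
    span = stop_last_nt - start_first_nt + 1
    if span % 3 != 0:
        raise ValueError("ORF span is not a whole number of codons")
    # The stop codon is part of the span but encodes no residue.
    return span // 3 - 1


# NOV23: ATG at nucleotides 55-57, TAA stop at nucleotides 1120-1122.
print(orf_protein_length(55, 1122))
```

For NOV23 this gives 355 residues, which agrees with the 355 amino acid polypeptide described for SEQ ID NO:74.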



Table 23A. NOV23 Nucleotide Sequence (SEQ ID NO:73) 

TTCCGftCCGCTGCGCGCGGCTCCTGGGCTGTCACAGTCTCCCGTTGCCGCCGTCA TGTCCCGGCAGCTGTCGCG^ 

GCCCGGCCAGCCACGGTGCTGGGCGCCATGGAGATGGGGCGCCGCATGGACGCGCCCACa^CGCCGCAGTCACG 

CGCXSCCTTCCTGGAGCGCGGCCACACaSAGATAGACACGGCCTTCCTGTAmGCGA 

CnTOGCGGCCTGGGGCTCaSAATGGGCAGCAGCGACTGCAGAGTGAAAATO 

GGGAACTCXICrGAAGCCTGACAGTGTCCGATCCCaGCTGGAGACGTCACTGAAGCG 

GACCTCTTCTATCTACATGCTlCCrGACCACAGCGCCCCGGTGGAAGAGACACTGCGTGCCTGCCaVCCAGOTC^ 
CAGGAGGGCAAGTTCGTGGAGCTTGGCCTCTCCAACTATGCCGCCTGGGAAGTGGCCGAGATCTGTACCCTCTGC 
AAGAGCAACGGCTGGATCCTGCCCACTGTGTACCAGGGCATGTACAGCGCCACCACCaSGC^ 
OTCTTCCCCnSCCTCAGGCACTTTGGACTOaGGTTCTATGCCT 

TCTGGCAGCTTCTGGGGCACTCTGGGCCCXSGGGGCTGATTGCTGCCTTCCCGCAGGGG^ 

TACAAGTATGAGGACAAGGACGGGAAACAGCCCGTGGGCCGCTTOTTTGGGAOTCAGl^^ 

AATCAGTTCTGGAAGGAGCa^CCACTTCGAGGGCATTGCCCTGGTGGAGAAGGCCCTGCAGGCCGCGT^ 

AGCX3CPCCCAGCATGACCTCGGCCGCCCTCCGGTGGATGTACCACCACTCACAGCTGCSVGGGTGC 

GCGGTCATCCTGGGCATGTCCaGCCTGGAGCAGCTGGAGCAGAACrTGGCAGCGGC^ 

CCGGCTGTCGTGGACGCCTTTAATCAAGCCTGGCATTTGTTTGCCCACGAATQTCCCAACTACT^ 

CATTGTGGCTCAGGCTGCC 



The disclosed NOV23 nucleic acid sequence maps to chromosome 1 and has 632 of 
658 bases (96%) identical to a gb:GENBANK-ID:AF040639|acc:AF040639.1 mRNA from 
Homo sapiens (Homo sapiens aflatoxin B1-aldehyde reductase mRNA, complete cds) (E = 
5.2e-^^^. 

A disclosed NOV23 polypeptide (SEQ ID NO:74) is 355 amino acid residues in 
length and is presented using the one-letter amino acid code in Table 23B. The SignalP, 
Psort and/or Hydropathy results predict that NOV23 has a signal peptide and is likely to be 
localized to the microbody (peroxisome) with a certainty of 0.5268. In alternative 
embodiments, a NOV23 polypeptide is located to the mitochondrial matrix space with a 
certainty of 0.5048, the mitochondrial inner membrane with a certainty of 0.2262, or the 
mitochondrial intermembrane space with a certainty of 0.2262. The SignalP predicts a likely 
cleavage site for a NOV23 peptide between amino acid positions 8 and 9, i.e. at the sequence 
SRA-RP. 
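
The predicted cleavage position can be illustrated by splitting the N-terminal residues of the NOV23 polypeptide after residue 8. This is a minimal sketch under stated assumptions: the constant `NOV23_N_TERMINUS` holds only the first 40 residues as they read in Table 23B, and `split_at_cleavage` is a hypothetical helper, not a SignalP function.

```python
# First 40 residues of the NOV23 polypeptide as given in Table 23B (assumed legible portion).
NOV23_N_TERMINUS = "MSRQLSRARPATVLGAMEMGRRMDAPTSAAVTRAFLERGH"


def split_at_cleavage(sequence, last_signal_residue):
    """Split a protein at a cleavage site lying after the given 1-based residue."""
    return sequence[:last_signal_residue], sequence[last_signal_residue:]


signal_peptide, mature_start = split_at_cleavage(NOV23_N_TERMINUS, 8)
# The residues flanking the cut correspond to the "SRA-RP" notation in the text.
print(signal_peptide[-3:] + "-" + mature_start[:2])
```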



Table 23B. Encoded NOV23 Protein Sequence (SEQ ID NO:74) 

MSRQLSRARPATVLGAMEMGRRMDAPTSAAVTRAFLERGHTEIIXrAFLYSrX5QSET^ 

KANPWIGNSIiKPDSVRSQLETSLKRLQCPRVDLFYLHAPDHSAPVEETLRACHQLHQEGKFVELGLSNYAAWEVAE 
ICTLCKSNGWILPTWQGMYSATTRQVETELFPCLRHFGLRFYAyNPLADQSPEGCGSFWGTLGPGADCCLPAGGL 
LTGKYKYEDKTCKQPVGRFFGTQWAEIYRNQFWKEmFEGIALVEKALQAAYGASAPSMTSAALRmYHHSQLQGA 
HGDAVIIjGMSSLEQLEQNIiAAAEEGPLEPAVVDAFNQAWHLFAHECPNYFI 

The NOV23 amino acid sequence was found to have 328 of 354 amino acid residues 
(92%) identical to, and 339 of 354 amino acid residues (95%) similar to, the 355 amino acid 
residue ptnr:SPTREMBL-ACC:Q9NUC3 protein from Homo sapiens (Human) (DJ657E11.3 
(ALDO-KETO REDUCTASE FAMILY 7, MEMBER A3 (AFLATOXIN ALDEHYDE 
REDUCTASE))) (E = 3.6e"'^^). 
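
The identity and similarity percentages quoted above follow directly from the match counts. A minimal sketch of the arithmetic is given below; `percent_of_alignment` is a hypothetical helper, and the observation that the quoted whole-number values correspond to truncation rather than rounding is an inference from these two figures only, not a statement about how the search program reports percentages in general.

```python
def percent_of_alignment(matches, aligned_length):
    """Percentage of aligned positions that are matches."""
    if aligned_length <= 0:
        raise ValueError("aligned_length must be positive")
    return 100.0 * matches / aligned_length


identity = percent_of_alignment(328, 354)    # identical residues
positives = percent_of_alignment(339, 354)   # identical or similar residues

# Truncating to whole percent gives the figures quoted in the text.
print(int(identity), int(positives))
```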

NOV23 is predicted to be expressed in the following tissues because of the expression 
pattern of a closely related homolog, the Homo sapiens aflatoxin B1-aldehyde reductase 
mRNA, complete cds (gb:GENBANK-ID:AF040639|acc:AF040639.1): pancreas, exocrine, 
adrenal gland, colon, ovary, uterus, prostate, stomach, eye, lymph, parathyroid, marrow, 
hepatocellular carcinoma. 

NOV23 has homology to the amino acid sequences shown in the BLASTP data listed 
in Table 23C. 



Table 23C. BLAST results for NOV23 

Columns: Gene Index/Identifier; Protein/Organism; Length (aa); Identity (%); Positives (%); Expect 

gi|6941683|emb|CAB72322.1| (AL035413)
    dJ657E11.3 (aldo-keto reductase family 7, member A3 (aflatoxin aldehyde
    reductase)) [Homo sapiens]
    Length: 355 aa; Identity: 328/354 (92%); Positives: 339/354 (95%); Expect: 0.0

gi|6912234|ref|NP_036199.1| (NM_012067)
    aldo-keto reductase family 7, member A3 (aflatoxin aldehyde reductase) [Homo sapiens]
    Length: 331 aa; Identity: 308/354 (87%); Positives: 317/354 (89%); Expect: e-173

gi|13627233|ref|XP_001439.2| (XM_001439)
    aldo-keto reductase family 7, member A3 (aflatoxin aldehyde reductase) [Homo sapiens]
    Length: 331 aa; Identity: 306/354 (86%); Positives: 316/354 (88%); Expect: e-172

gi|13627237|ref|XP_001438.2| (XM_001438)
    similar to AFLATOXIN B1 ALDEHYDE REDUCTASE 1 (AFB1-AR 1) (ALDOKETOREDUCTASE 7)
    (H. sapiens) [Homo sapiens]
    Length: 330 aa; Identity: 292/346 (84%); Positives: 302/346 (86%); Expect: e-160

gi|4502021|ref|NP_003680.1| (NM_003689)
    aldo-keto reductase family 7, member A2 (aflatoxin aldehyde reductase);
    aflatoxin beta1 aldehyde reductase [Homo sapiens]
    Length: 330 aa; Identity: 291/346 (84%); Positives: 301/346 (86%); Expect: e-159
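
The identifiers in the first column of Table 23C use the legacy NCBI pipe-delimited form (gi number, source database, accession). As an illustration only, the sketch below splits such an identifier into its parts; `parse_ncbi_identifier` is a hypothetical helper written for this example and is not part of any NCBI toolkit.

```python
def parse_ncbi_identifier(identifier):
    """Split a legacy 'gi|number|db|accession|name' identifier into its fields."""
    fields = identifier.strip().split("|")
    if len(fields) < 4 or fields[0] != "gi":
        raise ValueError(f"not a gi-style identifier: {identifier!r}")
    # Some databases (for example pir) leave the accession slot empty
    # and carry the entry name in the following slot instead.
    accession = fields[3] or (fields[4] if len(fields) > 4 else "")
    return {"gi": int(fields[1]), "database": fields[2], "accession": accession}


hit = parse_ncbi_identifier("gi|6912234|ref|NP_036199.1|")
print(hit["gi"], hit["database"], hit["accession"])
```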



The homology of these sequences is shown graphically in the ClustalW analysis 
shown in Table 23D. 



Table 23D. ClustalW Analysis of NOV23 

1) NOV23 (SEQ ID NO:74) 
2) gi|6941683 (SEQ ID NO:324) 
3) gi|6912234 (SEQ ID NO:325) 
4) gi|13627233 (SEQ ID NO:326) 
5) gi|13627237 (SEQ ID NO:327) 
6) gi|4502021 (SEQ ID NO:328) 

[Alignment figure: multiple sequence alignment of NOV23, gi|6941683, gi|6912234,
gi|13627233, gi|13627237 and gi|4502021; the sequences end at residues 355, 355,
331, 331, 330 and 330, respectively.]



Table 23E lists the domain description from DOMAIN analysis results against 
NOV23. This indicates that the NOV23 sequence has properties similar to those of other 
proteins known to contain these domains. 

Table 23E Domain Analysis of NOV23 

gnl|Pfam|pfam00248, aldo_ket_red, Aldo/keto reductase family. This family 
includes a number of K+ ion channel beta chain regulatory domains - these 
are reported to have oxidoreductase activity. 
CD-Length = 282 residues, 86.9% aligned 
Score = 143 bits (360), Expect = 2e-35 

NOV23:  10 PATVLGAMEMGRRMDAPTSAAVTRAFLERGHTEIDTAFLYSIXSQSETIL^-GLRMG  66
Sbjct:   8 PLLGLGTWKTPGRVDDEEAFEAVKAALDAGYRHFDTAEIYGNEEEVGEAIKEALFEG  64

NOV23:  67 SSDCRVKIATKANPWIGNSLKPDSVRSQLETSLKRLQCPRVDLFYLHAPDHSAPV 121
Sbjct:  65 S6WREDIFITSKLW-NTFHSPKHVREALEKSLKRLGLDYVDLYLIHWPDPLKPGDDVPI 123

NOV23: 122 EETLRACHQLHQEGKFVELGLSNYAAWEVAEICTLCKSNGWILPTVYQGMYSATTRQVET 181
Sbjct: 124 EETWKALEKLVDEGKVRSIGVSNFSAEQIiEEALSEAGKIPPWNQVEYHPYLRQ--D 178

NOV23: 182 ELFPCLRHFGLRFYAYNPLADQSPEGCGSFWGTLGPGADCCLPAGGLLTGKYKYEDKDGK 241
Sbjct: 179 ELRKFCKKHGI6VTAYSPLGSGI*L 202

NOV23: 242 QPVGRFFGTQWAEIYRNQFWKEHHFEGIALVEKALQAAYGASAPSMTSAALRWMYHHSQL 301
Sbjct: 203 DKFWSELGSPEL-LEDPAIJCKIAEKYGKTPAQVALRWVLQ 241

NOV23: 302 QGAHGDAVILGMSS 315 (SEQ ID NO:329)
Sbjct: 242 RGVSVIPKSST 252 (SEQ ID NO:330)



The masking of charged amino or carboxy groups by N-phthalidylation and 
O-phthalidylation has been used to improve the absorption of many drugs, including ampicillin 
and 5-fluorouracil. Following absorption of such prodrugs, the phthalidyl group is hydrolyzed 
to release 2-carboxybenzaldehyde (2-CBA) and the pharmaceutically active compound; in 
humans, 2-CBA is further metabolized to 2-hydroxymethylbenzoic acid by reduction of the 
aldehyde group. The enzyme responsible for the reduction of 2-CBA in humans is identified 
as human aldo-keto reductase (AKR), a homologue of rat aflatoxin B1-aldehyde reductase 
(rAFAR). Ireland et al. cloned human aldo-keto reductase (AKR) from a liver cDNA library; 
together with the rat protein, it establishes the AKR7 family of the AKR superfamily. 
Unlike its rat homologue, human AFAR (hAFAR) appears to be constitutively expressed in 
human liver, and is widely expressed in extrahepatic tissues. The deduced human and rat 
protein sequences share 78% identity and 87% similarity. Although the two AKR7 proteins 
are predicted to possess distinct secondary structural features which distinguish them from 
the prototypic AKR1 family of AKRs, the catalytic- and NADPH-binding residues appear to 
be conserved in both families. Certain of the predicted structural features of the AKR7 family 
members are shared with the AKR6 beta-subunits of voltage-gated K+-channels. In addition 
to reducing the dialdehydic form of aflatoxin B1-8,9-dihydrodiol, hAFAR shows high 
affinity for the gamma-aminobutyric acid metabolite succinic semialdehyde (SSA), which is 
structurally related to 2-CBA, suggesting that hAFAR could function as both an SSA reductase 
and a 2-CBA reductase in vivo. This hypothesis is supported in part by the finding that the 
major peak of 2-CBA reductase activity in human liver co-purifies with hAFAR protein. 

Alterations of the distal portion of the short arm of chromosome 1 (1p) are among the 
earliest abnormalities of human colorectal tumors. Loss of heterozygosity analysis has 
previously revealed a smallest region of overlapping deletion (SRO) B, at 1p35-36.1, deleted 
in 48% of sporadic tumors. From this region Nishi et al. have cloned a gene encoding a 
protein of 330 amino acids that is 78% identical with the Rattus norvegicus aflatoxin B1 
aldehyde reductase (Afar) and, therefore, likely represents its human homologue. In rat liver, 
Afar is strongly inducible by the antioxidants ethoxyquin and butylated hydroxyanisole, 
which protect the rat against aflatoxin B1-induced liver tumorigenesis by detoxifying its 
genotoxic and cytotoxic dialdehyde. Human AFAR is expressed in a broad range of tissues 
and, therefore, is likely involved in endogenous detoxication pathways. Impaired detoxication 
of genotoxic aldehydes and ketones, which are involved in tumorigenesis of the colon and 
breast, may be a crucial factor both for tumor initiation and progression. 

The novel human Aldo-Keto Reductase Family 7, member A3-like proteins of the 
invention contain an aldo/keto reductase family domain and share 96% homology to human 
Aldo-Keto Reductase Family 7, member A3. Therefore it is anticipated that this novel 
protein has a role in the regulation of essentially all cellular functions and could be a 
potentially important target for drugs. Such drugs may have important therapeutic 
applications, such as treating numerous tumors. See, generally, Kelly et al., Endocrinology 
2000 Sep;141(9):3194-9; and Praml et al., Cancer Res 1998 Nov 15;58(22):5014-8. 

The protein similarity information, expression pattern, cellular localization, and map 
location for the NOV23 protein and nucleic acid disclosed herein suggest that this Aldo-Keto 
Reductase Family 7, member A3-like protein may have important structural 
and/or physiological functions characteristic of the Aldo-Keto Reductase Family 7. 
Therefore, the nucleic acids and proteins of the invention are useful in potential diagnostic 
and therapeutic applications and as a research tool. These include serving as a specific or 
selective nucleic acid or protein diagnostic and/or prognostic marker, wherein the presence or 
amount of the nucleic acid or the protein is to be assessed. These also include potential 
therapeutic applications such as the following: (i) a protein therapeutic, (ii) a small molecule 
drug target, (iii) an antibody target (therapeutic, diagnostic, drug targeting/cytotoxic 
antibody), (iv) a nucleic acid useful in gene therapy (gene delivery/gene ablation), (v) an 
agent promoting tissue regeneration in vitro and in vivo, and (vi) a biological defense 
weapon. 

The NOV23 nucleic acids and proteins of the invention have applications in the 
diagnosis and/or treatment of various diseases and disorders. For example, the compositions 
of the present invention will have efficacy for the treatment of patients suffering from: 
hemophilia, hypercoagulation, idiopathic thrombocytopenic purpura, autoimmune disease, 
allergies, immunodeficiencies, transplantation, graft versus host disease, 
lymphaedema, hypercalcemia, ulcers, fertility, endometriosis, diabetes, Von Hippel-Lindau 
(VHL) syndrome, pancreatitis, obesity, hypoparathyroidism, adrenoleukodystrophy, 
congenital adrenal hyperplasia, tuberous sclerosis as well as other diseases, 
disorders and conditions. 

These materials are further useful in the generation of antibodies that bind 
immunospecifically to the novel substances of the invention for use in therapeutic or 
diagnostic methods. These antibodies may be generated according to methods known in the 
art, using prediction from hydrophobicity charts, as described in the "Anti-NOVX 
Antibodies" section below. The disclosed NOV23 protein has multiple hydrophilic regions, 
each of which can be used as an immunogen. In one embodiment, a contemplated NOV23 
epitope is from about amino acids 5 to 10. In another embodiment, a contemplated NOV23 
epitope is from about amino acids 20 to 35. In other specific embodiments, contemplated 
NOV23 epitopes are from about amino acids 40 to 48, 60 to 62, 75 to 100, 110 to 140, 170 to 
190, 195 to 215, 235 to 260, 292 to 305, 320 to 325, 340 to 342 and 348 to 349. 
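
The hydrophobicity charts referred to above are not specified further in this passage. As a rough illustration of how hydrophilic stretches might be located, the sketch below computes a sliding-window hydropathy profile. It assumes the Kyte-Doolittle scale, a 7-residue window and a cutoff of -1.0; all three are assumptions chosen for the example, and the function names are hypothetical. It is not claimed to reproduce the epitope ranges listed in the text.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic, negative = hydrophilic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=7):
    """Mean hydropathy of every full window along the sequence."""
    scores = [KYTE_DOOLITTLE[residue] for residue in sequence]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def hydrophilic_windows(sequence, window=7, threshold=-1.0):
    """1-based (start, end) coordinates of windows whose mean falls below threshold."""
    profile = hydropathy_profile(sequence, window)
    return [(i + 1, i + window) for i, mean in enumerate(profile) if mean < threshold]


# First 40 residues of NOV23 as they read in Table 23B (assumed legible portion).
print(hydrophilic_windows("MSRQLSRARPATVLGAMEMGRRMDAPTSAAVTRAFLERGH"))
```

Overlapping windows that pass the cutoff would normally be merged into longer candidate regions before being considered as immunogens; that step is omitted here for brevity.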

NOV24 

A disclosed NOV24 (designated CuraGen Acc. No. CG57274-01), which encodes a 
novel Ral Guanine Nucleotide Exchange Factor 3-like protein and includes the 2171 
nucleotide sequence (SEQ ID NO:75) is shown in Table 24A. An open reading frame for the 
mature protein was identified beginning with an ATG initiation codon at nucleotides 26-28 
and ending with a TGA stop codon at nucleotides 2150-2152. Putative untranslated regions 
are underlined in Table 24A, and the start and stop codons are in bold letters. 

Table 24A. NOV24 Nucleotide Sequence (SEQ ID NO:75) 

CCACTGAGAGGGACGGGCGCCGGCCA TGGAGCGCACAGCAGGCAAAGAGCTGGCCGCACCGCTGCAGGACTGGGGT 
GAAGAGACCGAGGACGGCGCGGTGTACAGTGTCTCCCTGCGGCGGCAGCGCAGTCAGCGCTCAGATCACCAGAGGT 
CAGGAGTTGGACAGGCTCCCAGCCCCATTGCCAATACCTTCCTCCACTATCGAACCAGCAAGGTGAGGGTGCTGAG 
GGCAGCGCGCCTGGAGCGGCTGGTGGGAGAGTTGGTGTTTGGAGACCGTGAGCAGGACCCCAGCTTCATGCCCGCC 
TTCCTGGCCACCTACCGGACCTTTGTACCCACTGCCTGCCTGCTGGGCTTTCTGCTGCCACCAATGCCACCGCCCC 
CACCTCCCGGGGTAGAGATCAAGAAGACAGCGGTACAAGATCTGAGCTTCAACAAGAACCTGAGGGCTGTGGTGTC 
AGTGCTGGGCTCCTGGCTGCAGGACCACCCTCAGGATTTCCGAGACCXICCCTGCCaVTTaS^^ 

CGAACCTTTCTGGGCTGGGCGGCCCCaVGGGAGTGCTGAGGCTCaiAAAAGC^^ 

AGGAGGCTGAGCGAGAGCAGGAAGAG6AGCCGCCTCAGGTGTGGTCAGGACCTCCCAGAGCT 

CCCAGACTCTTCAGAGGCCTGCGCGGAGGAAGAGGAAGGGCrCATGCCTCAAGGTCCCC^ 

GTGGACGAGGTGGCCGAGCAGCTGACCCTCATAGACTTGGAGCrcriTCrCCA^^ 

GCTCCGTGTGGTCGCAGAGGGACCGGCCGGGGGCTGCAGGCGCCTCCCCCACTGTGCGCGCCACCGTGGCCCAGTT 

CAACACCGTGACCGGCTGTGTGCTGGGTTCCGTGCTCGGAGCACCGGGCTTGGCCXK:CCCGCAGAGGGCGm^ 

CTGGAGAAGTGGATCCGCATCGCCCAGCGCTGCCGAGAACTGCGGAACTTCTCCTCCTTGCGCGCC^ 

CCCTGCAATCTAACCCCATCTACCGGCTCAAGCGCAGCTGGGGGGCAGTGAGCCGGGAACCGCTATCTACTTTCA^ 

GAAACTTTCGCAGATTTTCTCCGATGAGAACAACCACCTCAGCAGCAGAGAGATTCTTT^ 

GAGGGATCCCAAGAAGAGGACAACACCCCAGGCAGCCTGCCCTCAAAACCACCCCCAGGCCCTGTCC^ 

GCACCTTCCTTACGGACCTGGTTATGCTGGACACAGCCCTGCCGGATATGTTGGAGGGGGATCTCATTAA 

GAAGAGGAGGAAGGAGTGGGAGATCCTGGCCCGCATCCAGCAGCTGCAGAGGCGCTGTCAGAGCTACACCCTGAGC 

CCCCACCCGCCCATCCTGGCTGCCCTGCATGCCCAGAACCAGCTCACCGAGGAGCAGAGCTACCGGCTCTCCCGGG 

TCATTGAGCCACCAGCTGCCTCCTGCCCCAGCTCCCCACGCATCCGACGGCGGATCAGCCTCACCAAGCGTCTCAG 

TGCGAAGCTTGCCCGAGAGAAAAGCTCATCACCTAGTGGGAGTCCCGGGGACCCCTCATCCCCCACCTCCAGTGTG 

TCCCCAGGGTCACCCCCCTCAAGTCCTAGAAGCAGAGATGCTCCTGCTGGCAGTCCCCCGGCCTCTCCAGGGCCCC 

AGGGCCCCAGCACCAAGCTGCCCCTGAGCCTGGACCTGCCCAGCCCCCGGTCCCCCGTAACCCTAGACCCCTTTAG 

CGCCCGGGTCCCTCTACCGGCGCAGCAGAGCTCGGAGGCCCGTGTCATCCGCGTCAGCATCGACAATGACCACGGG 

AACCTGTATCGAAGCATCTTGCTGACCAGTCAGGACAAAGCCCCCAGCGTGGTCCGGCGAGCCTTGCAGAAGCACA 

ATGTGCCCCAGCCCTGGGCCTGTGACTATCAGCTCTTTCAAGTCCTTCCTGGGGACCGGCTCCTGATTCCTGACAA 

TGCCAACGTCTTCTATGCCATGAGTCCAGTCGCCCCCAGAGACTTCATGCTGCGGCGGAAAGAGGGGACCCGGAAC 

ACTCTGTCTGTCTCCCCAAGCTGAGQCAGCCCTGTCCTCTCCA 



The disclosed NOV24 nucleic acid sequence maps to chromosome 19 and has 1552 of 
2159 bases (71%) identical to a gb:GENBANK-ID:AF237669|acc:AF237669.1 mRNA from 
Mus musculus (Mus musculus RalGDS-like protein 3 mRNA, complete cds) (E = 4.8e-189). 

A disclosed NOV24 polypeptide (SEQ ID NO:76) is 708 amino acid residues in 
length and is presented using the one-letter amino acid code in Table 24B. The SignalP, 
Psort and/or Hydropathy results predict that NOV24 does not have a signal peptide and is 
likely to be localized to the microbody (peroxisome) with a certainty of 0.3000. In 
alternative embodiments, a NOV24 polypeptide is located to the nucleus with a certainty of 
0.3000, the mitochondrial matrix space with a certainty of 0.1000, or the lysosome (lumen) 
with a certainty of 0.1000. 



Table 24B. Encoded NOV24 Protein Sequence (SEQ ID NO:76) 



mertagkelaaplqdwgbbteix5avysvslrrqrsqrsdhqrsgvgqapspiantflhyrtskvrvlraarlerl 
vgeiivfgdreqdpsfmpaflatyrtfvptacllgfllppmppppppgveikktavqdlsfnknij^vvsv^ 
qdhpqdfrdppahsdlgsvrtflgwaapgsaeaqkaekij:.edfi,eeabreqeeeppqvwsgpprvaqtsdpdsse 
acaeeeeglmpqgpqlldfsvdevaeqltlidlelfskvrlyeclgsvwsqrdrpgaagasptvratvaqfntto 

GCVLGSVLGAPGIAAPQRAQRI^KWIRIAQRCRELRNFSSLRAILSALQSNPIYRI,KRSWGAVSREPLSTF 

qifsdennhlssreilfqeeategsqeedntpgslpskpppgpvpylgtfltdlvmldtalpdmlegdi^infekr 

rkeweilariqqlqrrcqsytlsphppilaalhaqnqlteeqsyrlsrvieppaascpssprirrrisltkrlsa 

klarekssspsgspgdpssptssvspgsppssprsrdapagsppaspgpqgpstklpiisldlpsprspvtldpfs 

arvplpaqqssearvirvsidl^hgnlyrsilltsqdkapsvvrralqkhnvpqpwacdyqlfqvlpgdrl^ 

nanvfyamspvaprdfmlrrkegtrntlsvsps 



The NOV24 amino acid sequence was found to have 577 of 709 amino acid residues 
(81%) identical to, and 629 of 709 amino acid residues (88%) similar to, the 709 amino acid 
residue ptnr:SPTREMBL-ACC:Q9JID4 protein from Mus musculus (Mouse) (RALGDS- 
LIKE PROTEIN 3) (E = 5.9e'^^^). 

NOV24 is expressed in at least the following tissues: Mammary gland/Breast, Uterus, 
Thyroid, Cartilage, Adrenal Gland/Suprarenal gland, Kidney, Liver, Lymph node, Pancreas, 
Substantia Nigra, Epidermis, Cervix, Colon, Lung, Parathyroid Gland, and Whole Organism. 
Expression information was derived from the tissue sources of the sequences that were 
included in the derivation of the sequence of NOV24. 

NOV24 has homology to the amino acid sequences shown in the BLASTP data listed 
in Table 24C. 



Table 24C. BLAST results for NOV24 

Columns: Gene Index/Identifier; Protein/Organism; Length (aa); Identity (%); Positives (%); Expect 

gi|15186754|gb|AAK91126.1|AF239661_1 (AF239661)
    RalGDS-related effector protein of M-Ras [Mus musculus]
    Length: 709 aa; Identity: 577/714 (80%); Positives: 629/714 (87%); Expect: 0.0

gi|12963751|ref|NP_076111.1| (NM_023622)
    RalGDS-like protein 3; Ral guanine-nucleotide exchange factor [Mus musculus]
    Length: 709 aa; Identity: 576/714 (80%); Positives: 628/714 (87%); Expect: 0.0

gi|12836390|dbj|BAB23634.1| (AK004876)
    RALGDS-LIKE PROTEIN 3 - data source: SPTR, source key: Q9JID4, evidence: ISS - putative
    [Mus musculus]
    Length: 343 aa; Identity: 251/320 (78%); Positives: 279/320 (86%); Expect: e-127

gi|14717390|ref|NP_055964.1| (NM_015149)
    RalGDS-like protein [Homo sapiens]
    Length: 768 aa; Identity: 285/739 (38%); Positives: 409/739 (54%); Expect: e-120

gi|10185686|gb|AAG14400.1|AF186798_1 (AF186798)
    RalGDS-like [Homo sapiens]
    Length: 768 aa; Identity: 285/739 (38%); Positives: 409/739 (54%); Expect: e-120



The homology of these sequences is shown graphically in the ClustalW analysis 
shown in Table 24D. 



Table 24D. ClustalW Analysis of NOV24 

1) NOV24 (SEQ ID NO:76) 
2) gi|15186754 (SEQ ID NO:331) 
3) gi|12963751 (SEQ ID NO:332) 
4) gi|12836390 (SEQ ID NO:333) 
5) gi|14717390 (SEQ ID NO:334) 
6) gi|10185686 (SEQ ID NO:335) 

[Alignment figure: multiple sequence alignment of NOV24, gi|10185686, gi|12836390,
gi|12963751, gi|14717390 and gi|15186754; the sequences end at residues 708, 768,
343, 709, 768 and 709, respectively.]




Table 24E lists the domain description from DOMAIN analysis results against 
NOV24. This indicates that the NOV24 sequence has properties similar to those of other 
proteins known to contain these Ras-related domains. 



Table 24E Domain Analysis of NOV24 

gnl|Smart|smart00147, RasGEF, Guanine nucleotide exchange factor for Ras-like 
small GTPases 
CD-Length = 242 residues, 98.8% aligned 
Score = 216 bits (551), Expect = 3e-57 

NOV24: 241 LLDFSVDEVAEQLTLIDLELFSKVRLYECLGSVWSQRDRPGAAGASPTVRATVAQFNTVT 300
Sbjct:   1 LLL1J3PKELAEQLTLLDFELFRKIDPSELLGSVWGKRSKKS--PSPLNLERFIERENEVS  58

NOV24: 301 GCVLGSVLGAPGLAAPQRAQRLEKWIRIAQRCRELRNFSSLRAILSALQSNPIYRLKRSW 360
Sbjct:  59 NWVATEIIJCQTTP--KDRAEIiSKFIQVAKHCRELNNFNSIMAIVSAIiSSSPISRIJCKTW 116

NOV24: 361 GAVSREPLSTFRKLSQIFSDENNHLSSREILFQEEATEGSQEEDNTPGSLPSKPPPGPVP 420
Sbjct: 117 EKLPSKYKKLFEELEELLDPSRNFKNYREALSSCN I^PCIP 157

NOV24: 421 YLGTFLTDLVMLDTALPDMLEGDLINFEKRRKEWEILARIQQLQRRCQSYTLSPHPPILA 480
Sbjct: 158 FLGVIiKDLTFIDEGNPDFI,KN6LVNFEKRRKIJUKIIJlEIRQLQS--QPYNI*RPNRS 215

NOV24: 481 AL--HAQNQLTEEQ-SYRLSRVIE 501 (SEQ ID NO:336)
Sbjct: 216 SLLQQSLDSLPEENELYELSLRIE 239 (SEQ ID NO:337)



gnl|Pfam|pfam00617, RasGEF, RasGEF domain. Guanine nucleotide exchange
factor for Ras-like small GTPases.

CD-Length = 188 residues, 100.0% aligned

Score = 181 bits (459), Expect = 1e-46

NOV24: 242 LDFSVDEVAEQLTLIDLELFSKVRLYECLGSVWSQRDRPGAAGASPTVRATVAQFNTVTG 301
1 khlllk^ III k ^llll II ^^1 II + h II H
Sbjct: 1 LLLDPIiELAKQLTLLEHELFKKIDPFECLGQVWGKKY--GKNERSPNIDKTIKNFNQLTN 58

NOV24: 302 CVLGSVLGAPGLAAPQRAQRLEKWIRIAQRCRELRNFSSLRAILSALQSNPIYRLKRSWG 361
I +11+ ++1 + 1++! nil Ihll IIHIt | + |iiit|++|
Sbjct: 59 FVGTTirJ[^--TDPKKRAELIQKFIQVADHC3EiELNNFNSL^ 116

NOV24: 362 AVSREPLSTFRKLSQIFSDENNHLSSREILFQEEATEGSQEEDNTPGSLPSKPPPGPVPY 421
I + I I -^1^^+ + I + IIH I I Ih
Sbjct: 117 YVPPQSLKLFEELNKLMDSDRNFSNYRELL-------------------KSIFPLPCVPF 157

NOV24: 422 LGTFLTDLVMLDTALPDMLEGDLINFEKRRK 452 (SEQ ID NO:338)
I HHI k II It +1 + 11 WW
Sbjct: 158 FGVYI*SDLTFLEEGWPDFLErai.VNFSKRRK 188 (SEQ ID NO:339)



gnl|Pfam|pfam00788, RA, Ras association (RalGDS/AF-6) domain. RasGTP
effectors (in cases of AF6, canoe and RalGDS); putative RasGTP effectors in
other cases. Recent evidence (not yet in MEDLINE) shows that some RA domains
do NOT bind RasGTP. Predicted structure similar to that determined, and that
of the RasGTP-binding domain of Raf kinase.

CD-Length = 92 residues, 96.7% aligned

Score = 62.4 bits (150), Expect = 8e-11

NOV24: 615 VIRVSIDNDH-GNLYRSILLTSQDKAPSVVRRALQKHNVPQPWACDYQLFQVLPGDRLL- 672
IHl - \ k+l + I II Ih IIH - +1 1 HI
Sbjct: 4 VLRVYFQDLKPGVAYKTIRVSSEDTAPDWQLALEKFRLDDEDPEEYALVEVLSGDKERK 63

NOV24: 673 IPDNANVFYAM----SPVAPRDFMLRRKE 697 (SEQ ID NO:340)
1 hhl++
Sbjct: 64 LPDDENPLQLRLNLPRDGLSLRFLLKRRD 92 (SEQ ID NO:341)





gnl|Smart|smart00314, RA, Ras association (RalGDS/AF-6) domain; RasGTP
effectors (in cases of AF6, canoe and RalGDS); putative RasGTP effectors in
other cases. Kalhammer et al. have shown that not all RA domains bind
RasGTP. Predicted structure similar to that determined, and that of the
RasGTP-binding domain of Raf kinase. Predicted RA domains in PLC210 and
nore1 found to bind RasGTP. Included outliers (Grb7, Grb14, adenylyl
cyclases etc.)

CD-Length = 90 residues, 95.6% aligned

Score = 56.2 bits (134), Expect = 6e-09

NOV24: 615 VIRVSIDNDHGNLYRSILLTSQDKAPSVVRRALQKHNVPQPWACDYQLFQVLPGDRLL-I 673
kll III I+++ ++ + I h++ hi ++ +1 I +1 I + +
Sbjct: 4 VLRVYFD-DPGGTYKTLRVSKRTTARDVIQQLLEKFHLTDDPE-EYVLVEVKEGGKERVL 61

NOV24: 674 PDNANVFYAM----SPVAPRDFMLRRKE 697 (SEQ ID NO:342)
+ + |-»-||-'-++
Sbjct: 62 LPDEKPLQLQKLWPRQGSNLRFVI^RKRD 89 (SEQ ID NO:343)

gnl|Smart|smart00229, RasGEFN, Guanine nucleotide exchange factor for Ras-
like GTPases; N-terminal motif; A subset of guanine nucleotide exchange
factor for Ras-like small GTPases appear to possess this domain N-terminal
to the RasGef (Cdc25-like) domain. The recent crystal structure of Sos shows
that this domain is alpha-helical and plays a "purely structural role"
(Nature 394, 337-343).

CD-Length = 132 residues, 56.1% aligned

Score = 47.8 bits (112), Expect = 2e-06

NOV24: 87 DPSFMPAFIATYRTFVPTACLLGFLIiPPMPPPPPPGVEIKKTAVQDLSFNKN^RAWSVL 146
Ihh II 111^^ I II II IMII + h^^l
Sbjct: 26 DPTFVETFIJ^TYRSPITTQEIJ^QKIiLYRraAIPPEGVE-DIWVKEKVNPRRIQNRA^ 84

NOV24: 147 GSWLQDHPQDFRDPP 161 (SEQ ID NO:344)
Ml ^ I
Sbjct: 85 RLWVENYWQDFEEDP 99 (SEQ ID NO:345)
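
The "% aligned" figure given for each domain hit in Table 24E can be checked against the subject coordinates of the corresponding alignment. The Python sketch below is illustrative only and is not part of the original analysis. The function name `percent_aligned` is hypothetical, and the sketch assumes that the percentage is the share of the domain model (CD-Length) spanned by the first and last aligned subject positions.

```python
def percent_aligned(sbjct_start: int, sbjct_end: int, cd_length: int) -> float:
    """Share of a domain model covered by an alignment (1-based, inclusive coordinates)."""
    if not 1 <= sbjct_start <= sbjct_end <= cd_length:
        raise ValueError("subject coordinates must lie within the domain model")
    covered = sbjct_end - sbjct_start + 1
    return round(100.0 * covered / cd_length, 1)


# (first Sbjct position, last Sbjct position, CD-Length) as read from Table 24E
hits = {
    "smart00147 RasGEF": (1, 239, 242),
    "pfam00617 RasGEF": (1, 188, 188),
    "pfam00788 RA": (4, 92, 92),
    "smart00314 RA": (4, 89, 90),
    "smart00229 RasGEFN": (26, 99, 132),
}

for name, (start, end, length) in hits.items():
    print(f"{name}: {percent_aligned(start, end, length)}% aligned")
```

Under that assumption the computed values agree with the percentages listed for the five hits (98.8, 100.0, 96.7, 95.6 and 56.1).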



RasGEF (See Interpro IPR001895; RasGEF domain) is a member of the Guanine- 
nucleotide dissociation stimulators CDC25 family. Ras proteins are membrane-associated 
molecular switches that bind GTP and GDP and slowly hydrolyze GTP to GDP. The balance 
between the GTP bound (active) and GDP bound (inactive) states is regulated by the opposite 
action of proteins activating the GTPase activity and that of proteins which promote the loss 
of bound GDP and the uptake of fresh GTP. The latter proteins are known as guanine- 
nucleotide dissociation stimulators (GDSs) (or also as guanine-nucleotide releasing (or 
exchange) factors (GRFs)). Proteins that act as GDS can be classified into at least two 
families, on the basis of sequence similarities, the CDC24 family (see INTERPRO 
IPR001331) and the CDC25 family. 

The size of the proteins of the CDC25 family ranges from 309 residues (LTE1) to 
1596 residues (sos). The sequence similarity shared by all these proteins is limited to a region 
of about 250 amino acids generally located in their C-terminal section (currently the only 
exceptions are sos and ralGDS where this domain makes up the central part of the protein). 
This domain has been shown, in CDC25 and SCD25, to be essential for the activity of these 
proteins. 

Ras association (RalGDS/AF-6) domain, see RasGEFN (Interpro IPR000651; 
Guanine nucleotide exchange factor for Ras-1). The Guanine nucleotide exchange factor for 
Ras-like GTPases; N-terminal motif is found in several guanine nucleotide exchange factors 
for Ras-like small GTPases, and lies N-terminal to the RasGef (Cdc25-like) domain. Proteins 
belonging to this family include guanine nucleotide dissociation stimulator, which stimulates 
the dissociation of GDP from the Ras-related RalA and RalB GTPases and allows GTP 
binding and activation of the GTPases; GTPase-activating protein (GAP) for Rhol and Rho2, 
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which is involved in the control of cellular morphogenesis; and the yeast cell division control 
protein, which promotes the exchange of Ras-bound GDP by GTP and controls the level of 
cAMP when the cell division cycle is triggered. Also included is the son of sevenless protein, 
which promotes the exchange of Ras-bound GDP by GTP during neuronal development. 

This indicates that the sequence of the invention has properties similar to those of 
other proteins known to contain these domains and similar to the properties of these domains. 

The small GTPase Rit is a close relative of Ras, and constitutively active Rit can 
induce oncogenic transformation. Although the effector loops of Rit and Ras are highly 
related, Rit fails to interact with the majority of the known Ras candidate effector proteins, 
suggesting that novel cellular targets may be responsible for Rit transforming activity. To 
gain insight into the cellular function of Rit, Shao and Andres (J Biol Chem 2000;275:26914-24) 
searched for Rit-binding proteins by yeast two-hybrid screening. They identified the 
C-terminal Rit/Ras interaction domain of a protein, designated RGL3 (Ral GEF-like 3), 
that shares 35% sequence identity with the known Ral guanine nucleotide exchange factors 
(RalGEFs). RGL3, through a C-terminal 99-amino acid domain, interacted in a GTP- and 
effector loop-dependent manner with Rit and Ras. Importantly, RGL3 exhibited guanine 
nucleotide exchange activity toward the small GTPase Ral that was stimulated in vivo by the 
expression of either activated Rit or Ras. These data suggest that RGL3 functions as an 
exchange factor for Ral and may serve as a downstream effector for both Rit and Ras (OMIM 
number: 601619). 

Ras-related GTPases (see OMIM 190020) participate in signaling for a variety of 
cellular processes and are regulated in part by guanine nucleotide dissociation stimulators 
(GDSs, or exchange factors). Albright et al. (1993) used sequences derived from the yeast 
rasGDS proteins as probes and cloned cDNAs encoding a novel murine GDS protein. The 
protein stimulated the dissociation of guanine nucleotides from the ralA (179550) and ralB 
(179551) GTPases. The protein, designated RalGDS by them, was at least 20-fold more 
active on the ralA and ralB GTPases than any other GTPases tested. The 3.6-kb ralGDS 
mRNA and the 115-kD ralGDS protein were found in all tissues examined. 

Hofer et al. (1994) used a yeast 2-hybrid system to identify proteins in human that 
interact with Ras and isolated a gene encoding RALGDS, a protein which had previously 
been identified in mouse by Albright et al. (1993) as a guanine nucleotide exchange factor for 
the Ras-like molecule Ral. Hofer et al. (1994) reported that the interaction with Ras and Ras- 
like molecules was mediated by the C-terminal noncatalytic segment of RALGDS. They 
demonstrated that the interaction of the RALGDS C-terminal region with Ras is specific and 
dependent on the activation of Ras by GTP. 

Independently, Spaargaren and Bischoff (1994) used a yeast 2-hybrid system to 
screen for proteins that bind to R-ras (165090). From this screen they obtained several clones 
that encoded the C-terminal region of the guanine nucleotide dissociation stimulator for Ral 
(RALGDS). Using the 2-hybrid system, Spaargaren and Bischoff (1994) showed that the 
R-ras-binding domain of RALGDS interacts with H-ras, K-ras (190070), and Rap (RAP1A; 
179520). Their data further indicated that RALGDS is a putative effector molecule for R-ras, 
H-ras, K-ras, and Rap. 

Urano et al. (1996) demonstrated that ras-H (H-ras), R-ras, and Rap1A have the 
capacity to bind RalGDS in mammalian cells; however, only H-ras activates RalGDS. From 
these and other data they concluded that activation of RalGDS and its target Ral constitutes a 
distinct downstream signaling pathway from H-ras that potentiates oncogenic transformation. 

Schuler et al. (1996) generated a map of the human genome facilitated by the 
availability of expressed sequence tags (ESTs) mapping to radiation hybrid panels (see NCBI 
World Wide Web home page for more information). In their on-line map, they reported that 
ESTs (e.g., dbEST 785621; AA147088) representing a human homolog for the RALGDS 
gene map to chromosome 9q34 in the interval between D9S159 and D9S164 (see 
SCIENCE96 stSG2452). 

The protein similarity information, expression pattern, cellular localization, and map 
location for the NOV24 protein and nucleic acid disclosed herein suggest that this Ral 
Guanine Nucleotide Exchange Factor 3-like protein may have important structural and/or 
physiological functions characteristic of the guanine nucleotide exchange factors family. 
Therefore, the nucleic acids and proteins of the invention are useful in potential diagnostic 
and therapeutic applications and as a research tool. These include serving as a specific or 
selective nucleic acid or protein diagnostic and/or prognostic marker, wherein the presence or 
amount of the nucleic acid or the protein are to be assessed. These also include potential 
therapeutic applications such as the following: (i) a protein therapeutic, (ii) a small molecule 
drug target, (iii) an antibody target (therapeutic, diagnostic, drug targeting/cytotoxic 
antibody), (iv) a nucleic acid useful in gene therapy (gene delivery/gene ablation), (v) an 
agent promoting tissue regeneration in vitro and in vivo, and (vi) a biological defense 
weapon. 

The NOV24 nucleic acids and proteins of the invention have applications in the 
diagnosis and/or treatment of various diseases and disorders. For example, the compositions 
of the present invention will have efficacy for the treatment of patients suffering from: 
cancer, trauma, tissue regeneration (in vitro and in vivo), viral/bacterial/parasitic infections, 
immunological disease, respiratory disease, gastro-intestinal diseases, reproductive health, 
neurological and neurodegenerative diseases, bone marrow transplantation, metabolic and 
endocrine diseases, allergy and inflammation, nephrological disorders, cardiovascular 
diseases, muscle, bone, joint and skeletal disorders, hematopoietic disorders, urinary system 
disorders as well as other diseases, disorders and conditions. 

These materials are further useful in the generation of antibodies that bind 
immunospecifically to the novel substances of the invention for use in therapeutic or 
diagnostic methods. These antibodies may be generated according to methods known in the 
art, using prediction from hydrophobicity charts, as described in the "Anti-NOVX 
Antibodies" section below. The disclosed NOV24 protein has multiple hydrophilic regions, 
each of which can be used as an immunogen. In one embodiment, a contemplated NOV24 
epitope is from about amino acids 2 to 40. In another embodiment, a contemplated NOV24 
epitope is from about amino acids 65 to 90. In other specific embodiments, contemplated 
NOV24 epitopes are from about amino acids 115 to 120, 170 to 175, 195 to 230, 280 to 290, 
310 to 320, 360 to 405, 460 to 475, 495 to 570, 605 to 660 and 690 to 695. 
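
The text does not state which hydrophobicity scale or window size was used to locate the hydrophilic regions. Purely as an illustration, the Python sketch below assumes the Kyte-Doolittle scale and a sliding-window average, and flags windows whose mean hydropathy falls below a cutoff. The function names, the window size and the cutoff are hypothetical choices, not values taken from the disclosure.

```python
# Kyte-Doolittle hydropathy values (assumed scale; not named in the text)
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq: str, window: int = 7) -> list[float]:
    """Mean hydropathy of every full-length window along the sequence."""
    scores = [KYTE_DOOLITTLE[aa] for aa in seq.upper()]  # KeyError on non-standard residues
    return [sum(scores[i:i + window]) / window
            for i in range(len(scores) - window + 1)]


def hydrophilic_windows(seq: str, window: int = 7, cutoff: float = -1.0) -> list[tuple[int, int]]:
    """1-based inclusive (start, end) ranges of windows with mean hydropathy below the cutoff."""
    profile = hydropathy_profile(seq, window)
    return [(i + 1, i + window) for i, mean in enumerate(profile) if mean < cutoff]


# Hypothetical toy peptide: a hydrophobic stretch followed by a charged stretch
print(hydrophilic_windows("LLLLLDKDKD", window=5))
```

On the toy peptide only the windows covering the charged C-terminal stretch are reported. Applied to a full protein sequence, overlapping windows of this kind would be merged into candidate immunogenic regions.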

NOV25 

A disclosed NOV25 (designated CuraGen Acc. No. CG57276-01), which encodes a 
novel Endolyn-like protein and includes the 717 nucleotide sequence (SEQ ID NO:77) is 
shown in Table 25A. An open reading frame for the mature protein was identified beginning 
with an ATG initiation codon at nucleotides 83-85 and ending with a TAA stop codon at 
nucleotides 668-670. Putative untranslated regions are underlined in Table 25A, and the start 
and stop codons are in bold letters. 
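
The open reading frame coordinates can be checked with simple arithmetic. The Python sketch below is illustrative, and the function name `orf_protein_length` is hypothetical. It assumes 1-based inclusive nucleotide positions, counted from the first base of the start codon to the last base of the stop codon, and it does not count the stop codon as a residue.

```python
def orf_protein_length(start_first_nt: int, stop_last_nt: int) -> int:
    """Number of encoded residues for an ORF given 1-based inclusive coordinates."""
    span = stop_last_nt - start_first_nt + 1
    if span <= 0 or span % 3 != 0:
        raise ValueError("ORF span must be a positive multiple of three")
    return span // 3 - 1  # subtract the stop codon


# NOV25: ATG at nucleotides 83-85, TAA at nucleotides 668-670
print(orf_protein_length(83, 670))
```

For the NOV25 coordinates this gives 195 residues, which agrees with the length stated for the NOV25 polypeptide.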

Table 25A. NOV25 Nucleotide Sequence (SEQ ID NO:77) 

GAGGCGGCX3CCGCAGGQGATTGAGGGGTTGACTGAQCGTTGCGAGCCTTAGCTTTCTC 
ACACGA TCTOXrGGCTCTCCCGCTC^CTGCTTTGGGCCGCCACCTGCCT 

AAGAACACGACCCAGCACCCGAACGTGAaSAOTTTAGCGCCCATCTCCAACGTAAAATCATTG^ 
TCCCCCCT^CTCCCCAGAAACCTGTGAAGGTCGAAACAGCTGCGTTTCCTGTTTTAATGTTAGC^ 
CCTGCTTTTGGATAGAATGTCCCCCAACAGATGAGAGCTATTGTTCACATAACTCAACAGTO 
GGGAACACGACAGACTTCTGTTCCGGTAAGTATTCOTATTGGCrGCTTG^ 

GCCCTCCCCTTCTACAACTTCCAAGACAGTTACTACATCAGGTACAACMATAACACTGTGACT^ 
CTGTGCQAAAGTCTACCTTTGATGCAGCCAGTTTCATTGGAGGAATTGTCCT^ 
TTCTTTCTTTATAAATTCTGCAAATCTAAAGAACGAAATTACCACACTCTGTA AACAGACCCAT^^ 
ACTGGTGATTCATTTGTGTAACTC 






The disclosed NOV25 nucleic acid sequence maps to chromosome 6 and has 495 of 
649 bases (76%) identical to a gb:GENBANK-ID:RN0238574|acc:AJ238574.1 mRNA from 
Rattus norvegicus (Rattus norvegicus mRNA for endolyn) (E = 7.0e-^^). 

A disclosed NOV25 polypeptide (SEQ ID NO:78) is 195 amino acid residues in 
length and is presented using the one-letter amino acid code in Table 25B. The SignalP, 
Psort and/or Hydropathy results predict that NOV25 has a signal peptide and is likely to be 
localized to the plasma membrane with a certainty of 0.4600. In alternative embodiments, a 
NOV25 polypeptide is located to the endoplasmic reticulum (membrane) with a certainty of 
0.2800, the lysosome (membrane) with a certainty of 0.2000, or the endoplasmic reticulum 
(lumen) with a certainty of 0.1000. The SignalP predicts a likely cleavage site for a NOV25 
peptide between amino acid positions 23 and 24, i.e., at the sequence LSA-DK. 
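
The predicted cleavage position can be illustrated by splitting the N-terminus of the protein at residue 23. The Python sketch below is an illustration, not part of the SignalP output. The function name is hypothetical, and the 25-residue fragment is read from the start of Table 25B with one assumption: residue 18 is not legible in the source and is taken here to be L.

```python
def split_signal_peptide(protein: str, last_signal_residue: int) -> tuple[str, str]:
    """Split a protein after the given 1-based residue into (signal peptide, mature chain)."""
    if not 0 < last_signal_residue < len(protein):
        raise ValueError("cleavage position must fall inside the sequence")
    return protein[:last_signal_residue], protein[last_signal_residue:]


# First 25 residues of NOV25; residue 18 is an assumed reading of an illegible character
n_terminus = "MSRLSRSLLWAATCLGVLCTLSADK"
signal, mature = split_signal_peptide(n_terminus, 23)
print(signal[-3:], mature[:2])
```

With that reading, the signal peptide ends in LSA and the mature chain begins with DK, consistent with the LSA-DK cleavage site described above.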



Table 25B. Encoded NOV25 Protein Sequence (SEQ ID NO:78) 

MSRLSRSLLWAATCLGVIiCTLSADKNTTQHPNVTri^PIS]SrVKSL^ 

FWIECPPTDESY CSHNSTVSDCQVGNTTDFCSGKYS YWLLGS I PAKPTVQPSPSTTSKTVTTSGTTNNTVTPTSQPV 
RKSTFDAASFIGGI VliVIiGVQAVI FFLYKFCKSKERNYHTL 

The NOV25 amino acid sequence was found to have 110 of 195 amino acid residues 
(56%) identical to, and 136 of 195 amino acid residues (69%) similar to, the 195 amino acid 
residue ptnr:SPTREMBL-ACC:Q9QX82 protein from Rattus norvegicus (Rat) (ENDOLYN 
PRECURSOR) (E = 7.2e'^2). 
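
The identity and similarity percentages quoted for NOV25 follow directly from the match counts. The Python sketch below is illustrative, with a hypothetical function name. It assumes that the percentages are truncated to whole numbers rather than rounded, which appears consistent with the figures in the text (136 of 195 is about 69.7%, reported as 69%).

```python
def blast_percent(matches: int, aligned_length: int) -> int:
    """Whole-number percentage, truncated toward zero as the quoted figures suggest."""
    if aligned_length <= 0 or not 0 <= matches <= aligned_length:
        raise ValueError("matches must lie between 0 and the aligned length")
    return (100 * matches) // aligned_length


print(blast_percent(110, 195))  # identical residues, NOV25 vs. rat endolyn
print(blast_percent(136, 195))  # similar residues, NOV25 vs. rat endolyn
print(blast_percent(495, 649))  # identical bases, NOV25 nucleic acid vs. rat endolyn mRNA
```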

NOV25 is predicted to be expressed in the following tissues because of the expression 
pattern of (GENBANK-ID: gb:GENBANK-ID:RN0238574|acc:AJ238574.1), a closely 
related Rattus norvegicus mRNA for endolyn homolog in species Rattus norvegicus: testis, 
pancreas, lung, colon, kidney, skin, and breast. 

Homologies to any of the above NOV25 proteins will be shared by other NOV25 
proteins insofar as they are homologous to each other. Any reference to NOV25 is assumed 
to refer to NOV25 proteins in general, unless otherwise noted. 

NOV25 also has homology to the amino acid sequences shown in the BLASTP data 
listed in Table 25C. 



Table 25C. BLAST results for NOV25 


Gene Index/Identifier                            Protein/Organism                                             Length (aa)   Identity (%)         Positives (%)     Expect

gi|12483942|gb|AAG53905.1| (AF299340)            CD164 isoform delta 4 [Homo sapiens]                         184           170/199 (85%)        174/199 (87%)     1e-63
gi|9230741|gb|AAF85965.1|AF263279_1 (AF263279)   CD164 [Homo sapiens]                                         197           [illegible] (85%)    (87%)             [missing]
gi|3941728|gb|AAC82473.1| (AF106518)             sialomucin CD164 [Homo sapiens]                              178           154/198 (77%)        158/198 (79%)     1e-60
gi|5174407|ref|NP_006007.1| (NM_006016)          CD164 antigen, sialomucin; Sialomucin CD164 [Homo sapiens]   189           147/179 (82%)        153/179 (85%)     3e-49
gi|13929154|ref|NP_114000.1| (NM_031812)         endolyn [Rattus norvegicus]                                  195           110/197 (55%)        136/197 (68%)     2e-34



The homology of these sequences is shown graphically in the ClustalW analysis 
shown in Table 25D. 



Table 25D. ClustalW Analysis of NOV25 



1) NOV25 (SEQ ID NO:78)
2) gi|12483942 (SEQ ID NO:346)
3) gi|9230741 (SEQ ID NO:347)
4) gi|3941728 (SEQ ID NO:348)
5) gi|5174407 (SEQ ID NO:349)
6) gi|13929154 (SEQ ID NO:350)

[ClustalW alignment of NOV25 with the five sequences listed above; the aligned residues are not recoverable from the source text.]

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The sialomucins appear to play 2 key but opposing roles in vivo: the first as 
cytoprotective or antiadhesive agents, and the second as adhesion receptors. Despite their 
common functions, these mucins encompass a heterogeneous group of secreted or membrane- 
associated proteins. See OMIM 603356, SIALOMUCIN or CD164. 

Using 2 monoclonal antibodies and a retroviral expression cloning strategy, 
Zannettino et al. (Zannettino, et al., Blood 92: 2613-2628, 1998, PubMed ID: 9763543) 
isolated a cDNA encoding a novel transmembrane isoform of the mucin-like glycoprotein 
MGC-24, which they designated CD164. The mature CD164 protein contains 178 amino 
acids, has a molecular mass of 80 to 90 kD, and is extremely rich in serine and threonine. 
CD164 is expressed by human CD34+ hematopoietic progenitor cells. Zannettino et al. 
(1998) found that the CD164 receptor appears to play a role in hematopoiesis by facilitating 
the adhesion of CD34+ cells to bone marrow stroma and by negatively regulating CD34+ 
hematopoietic progenitor cell growth. They found that these functional effects are mediated 
by at least 2 spatially distinct epitopes, defined by specific monoclonal antibodies. Watt et al. 
(Watt, et al., Blood 92: 849-866, 1998, PubMed ID: 9680353) showed that these and other 
CD164 monoclonal antibodies show distinct patterns of reactivity when analyzed on 
hematopoietic cells from normal human bone marrow, umbilical cord blood, and peripheral 
blood. Expression of the CD164 epitope was found on developing myelomonocytic cells in 
bone marrow, being downregulated on mature neutrophils but maintained on monocytes in 
the peripheral blood. Watt et al. (1998) extended these studies further to identify PAC clones 
containing the CD164 gene and used the clone to localize the CD164 gene specifically to 
6q21 by fluorescence in situ hybridization. 

Endolyn is a membrane protein found in lysosomal and endosomal compartments of 
mammalian cells. Unlike 'classical' lysosomal membrane proteins, such as lysosome-associated 
membrane protein (lamp)-1, it is also present in a subapical compartment in 
polarized WIF-B hepatocytes. The structural features that determine sorting of endolyn are 
unknown (1). Ihrke et al. have identified a rat endolyn cDNA by expression screening. The 
cDNA encodes a ubiquitously expressed type I membrane protein with a short cytoplasmic 
tail of 13 amino acids and many putative sites for N- and O-linked glycosylation in the 
predicted luminal domain. Endolyn is closely related to two human mucin-like proteins, 
multi-glycosylated core protein (MGC)-24 and CD164 (MGC-24v), expressed in gastric 
carcinoma cells and bone marrow stromal and haematopoietic precursor cells respectively. 
The predicted transmembrane and cytoplasmic tail domains of endolyn, as well as parts of its 
luminal domain, also show some similarities with lamp-1 and lamp-2. Like these and other 
known lysosomal membrane proteins, endolyn contains a YXXΦ motif at the C-terminus of 
its cytoplasmic tail (where Φ is a bulky hydrophobic amino acid), but with no preceding 
glycine. Nonetheless, the last ten amino acids of this tail, when transplanted on to human 
CD8, caused efficient targeting of the chimaeric protein to endosomes and lysosomes in 
transfected normal rat kidney cells (1). 

Karlsson et al. demonstrated a genetically determined polymorphism of a human 
urinary mucin by the separation technique of SDS polyacrylamide gel electrophoresis 
followed by detection with radioiodinated lectins (2). Peanut agglutinin was the most 
effective lectin; hence, the proposed designation peanut-reactive urinary mucin (PUM). 
Karlsson et al. identified 4 common alleles with codominant inheritance. The same 
polymorphic protein is expressed in other normal and malignant tissues of epithelial origin 
including the mammary gland. Variation in white cell DNA detected with a cDNA probe for 
mammary mucin exactly matches the variation of the protein as demonstrated after 
electrophoresis using a series of monoclonal antibodies; studies in 2 large families 
demonstrated the precise correspondence. Gendler et al. studied the polymorphic epithelial 
mucin present on the surface of human mammary cells. It is developmentally regulated and 
aberrantly expressed in breast cancer (3). Lan et al. used a monospecific polyclonal 
antiserum against deglycosylated human pancreatic tumor mucin to select clones from a 
cDNA library developed from a human pancreatic tumor cell line (4). The close similarity of 
the cDNA sequence and the deduced amino acid sequence of pancreatic mucin to those of 
breast tumor mucin, as reported by Gendler et al. (3) and others, led them to suggest that the 
core protein, the apomucin, is produced by the same gene. The native forms of these 
molecules are distinct in size and degree of glycosylation, however, suggesting that factors 
other than the primary structure of the apomucin determine these characteristics. 

The novel human endolyn-like protein of the invention shares 76% homology to the 
rat Endolyn and to human Mucin CD164. Therefore it is anticipated that this novel protein 
has a role in the regulation of essentially all cellular functions and could be a potentially 
important target for drugs. Such drugs may have important therapeutic applications, such as 
treating numerous tumors. Ihrke et al., Biochem J 2000 Jan 15;345 Pt 2:287-96; Karlsson, et 
al., Ann. Hum. Genet. 47: 263-269, 1983; Gendler, et al., J. Biol. Chem. 265: 15286-15293, 
1990; Lan, M., et al., J. Biol. Chem. 265: 15294-15299, 1990. 

The protein similarity information, expression pattern, cellular localization, and map 
location for the NOV25 protein and nucleic acid disclosed herein suggest that this Endolyn-like 
protein may have important structural and/or physiological functions characteristic of the 
Mucin family. Therefore, the nucleic acids and proteins of the invention are useful in 
potential diagnostic and therapeutic applications and as a research tool. These include serving 
as a specific or selective nucleic acid or protein diagnostic and/or prognostic marker, wherein 
the presence or amount of the nucleic acid or the protein are to be assessed. These also 
include potential therapeutic applications such as the following: (i) a protein therapeutic, (ii) a 
small molecule drug target, (iii) an antibody target (therapeutic, diagnostic, drug 
targeting/cytotoxic antibody), (iv) a nucleic acid useful in gene therapy (gene delivery/gene 
ablation), (v) an agent promoting tissue regeneration in vitro and in vivo, and (vi) a biological 
defense weapon. 

The NOV25 nucleic acids and proteins of the invention have applications in the 
diagnosis and/or treatment of various diseases and disorders. For example, the compositions 
of the present invention will have efficacy for the treatment of patients suffering from: 
diabetes, Von Hippel-Lindau (VHL) syndrome, pancreatitis, obesity, fertility, hypogonadism, 
systemic lupus erythematosus, autoimmune disease, asthma, emphysema, scleroderma, 
allergy, ARDS, psoriasis, actinic keratosis, tuberous sclerosis, acne, hair growth/loss, 
alopecia, pigmentation disorders, endocrine disorders, renal artery stenosis, interstitial 
nephritis, glomerulonephritis, polycystic kidney disease, renal tubular acidosis, IgA 
nephropathy, hypercalcemia, Lesch-Nyhan syndrome as well as other diseases, disorders and 
conditions. 

These materials are further useful in the generation of antibodies that bind 
immunospecifically to the novel substances of the invention for use in therapeutic or 
diagnostic methods. These antibodies may be generated according to methods known in the 
art, using prediction from hydrophobicity charts, as described in the "Anti-NOVX 
Antibodies" section below. The disclosed NOV25 protein has multiple hydrophilic regions, 
each of which can be used as an immunogen. In one embodiment, a contemplated NOV25 
epitope is from about amino acids 25 to 35. In another embodiment, a contemplated NOV25 
epitope is from about amino acids 43 to 62. In other specific embodiments, contemplated 
NOV25 epitopes are from about amino acids 80 to 110, 125 to 150 and 182 to 187. 

NOV26 

A disclosed NOV26 (designated CuraGen Acc. No. CG57224-01), which encodes a 
novel Arylacetamide Deacetylase-like protein and includes the 2082 nucleotide sequence 
(SEQ ID NO:79) is shown in Table 26A. An open reading frame for the mature protein was 
identified beginning with an ATG initiation codon at nucleotides 499-501 and ending with a 
TGA stop codon at nucleotides 1729-1731. Putative untranslated regions are underlined in 
Table 26A, and the start and stop codons are in bold letters. 



Table 26A. NOV26 Nucleotide Sequence (SEQ ID NO:79) 

CAGCTTCX:CCATGGATCACTCTCCAAATAGATTCTTTACACACAGGTAATGTCACT 

CCTTGTCCCCCAGCCCCCGAGTGGTGCTCTTCGGGGGCCCTCATCCATTGGCAAGTGACTGTCTATTCACATCTC 

TCTTCCTGTTGTTGAGTGAGTGa^GGGAGGGAGCCTGCCGGGGATCCACAGCTCCCAG 

ACAGTGCTCTTGGCCCTGCATGTGCTGTCACGGCCATTTGGGGTCTATATCCTGTCTCTTAGAGGACAC^ 

AATCTCTCAAATTCAGGTTTCTCCTGTGTCCCTACCTGGTGCCCGGCCCGGGCTGTTTTTCTCTGTTTCAAATGC 

CAGGGCTACTTATGGACTCCTATTCTACCTGCAARACCCTACTTGAATGCTCCCTCAGTT 

CTGCTCCTTCCAGCCTCCCCACAACAACTACAGCACCACCACTATATAA TGGCTAAATCTGTTGAG^ 

TGGGCCAGACACTGTGCTGAGTACATGGATATGTTTTCTTCTTTAATCCTCACAACCCCTCGAGTCAGCCCC^ 

CTAGGCTACCCTTTGGCAAATTCACATCATTATTCAATCAAGAGCCTCTGGGGAGAAAAGTTGGAAAACCC^ 

CTCTACCTGGACACAGTCCAGAGCCTATGGATTCCTGAAGAGCCCCCTOTACCr 

AAAAAGGACCCTGAACTTGTGGTGACCGACCTGCGTTTTGGGACGATACCCGTGAGGCTGTTCCAGCCGAAGG^ 
GCATCCTCCAGACXrCCGGCGAGGCATCATCTTCTACCATGGAGGGGCCAC^ 

CATGGCCTGTGCAATTATCTGGCCCGGGAGACTGAATCTGTACTTCTGATGATTGGGTACCGCT^GCT 

CACCATTCCCCTGCCCTTTTCCT^GACTGCATGAATGCCrCCATTCACTTCCTGAAGGCCCT 

GTGGACCCCTCCAGGGTTGTGGTCTGTGGAGAAAGCGTCGGAGGTGCAGCGGTGGCCGCCATCACCCAGGCCTTG 

GTGGGCAGATCAGATCTTCCCCGGATCCGGGCTCAGGTTCrGATTTATCCAGTTGTCCAGGCA^ 

TCGCCATCCTTTCAGCAGAACCAAAAIOTCCCATTACTTTCCCGGAAGTTCATGGTGAC^ 

CTGGCCATTGACCrCTCCTGGCGTGACGCCATCITGAACGGC3VCrTGCGTACCCCC^ 

GAGAAGTGGCTCACCCCTGACAACATCCCCAAGAAATTTAAGAACACAGGCTACCAACCCTGGTCTCCCGGCCCT 

TTTAATGAAGCTGCCTATCTAGAAGCCAAACATATGCTGGATGTAGAAAATTCACCXrCTGATAGm 

GTCATCGCTCAGCITCCTGAGGCCTTCCTGGTGAGCrGTGAGAATGACATACTCCGTGATGACAGCTTGCTC 

AAGAAGCGCTTGGAGGACCAGGGGGTCCGCGTGACATGGTACCACCTGTATGATGGITTTC^ 

TTTTTTGATAAGAAGGCTCTCTCITTCCCATGTTCCCTGAAGATTGTGAATGCTGTAGTCAGTTATATA^ 

ATATG ATAGTAACCCTGGQGCCCGAGGAGGAAGGGGCAAGTATGGACTCTACCAGAAACCGGGTGCTTTAGTGAG 

TTCTATTTTATTGACTAAAGAGGTGCTACaiTCAATGCTTGQGGCAGCTGGGAAGGGTGAGAAGTAAGCT;^ 

CTTGCTTAGTATTCAAGAAAATCCAAACTGTGTCTGTTTCCTTCCAGCACTAACAATGTCCATTGCTGGATCT 

CGACATTCTCTAACATTCCCATTTAGGTGAAATAAATATCAAAAGGAGAAAAAAATGCCTTTAAAAATTTCTC^ 

AGCCCCAACATATAAGATCTGTGCAGAATAAATGCCAACAACTGGTCATACCGTCAA 



The disclosed NOV26 nucleic acid has 295 of 500 bases (59%) identical to a 
gb:GENBANK-ID:AB037784|acc:AB037784.1 mRNA from Homo sapiens (Homo sapiens 
mRNA for KIAA1363 protein, partial cds) (E = 23e-^^). 

A disclosed NOV26 polypeptide (SEQ ID NO:80) is 410 amino acid residues in 
length and is presented using the one-letter amino acid code in Table 26B. The SignalP, 
Psort and/or Hydropathy results predict that NOV26 does not have a signal peptide and is 
likely to be localized to the nucleus with a certainty of 0.8800. In alternative embodiments, a 
NOV26 polypeptide is located to the microbody (peroxisome) with a certainty of 0.2235, the 
lysosome (membrane) with a certainty of 0.1734, or the mitochondrial matrix space with a 
certainty of 0.1000. 
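
The localization call amounts to ranking the reported certainty values and taking the highest. The Python sketch below is illustrative only. The dictionary holds the four certainties quoted above for NOV26, the function name is hypothetical, and the values are treated as independent scores rather than as probabilities that sum to one.

```python
# Certainty values quoted in the text for NOV26
nov26_certainties = {
    "nucleus": 0.8800,
    "microbody (peroxisome)": 0.2235,
    "lysosome (membrane)": 0.1734,
    "mitochondrial matrix space": 0.1000,
}


def rank_localizations(scores: dict[str, float]) -> list[tuple[str, float]]:
    """Return (compartment, certainty) pairs ordered from most to least certain."""
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


ranked = rank_localizations(nov26_certainties)
primary, alternatives = ranked[0], ranked[1:]
print("predicted:", primary[0])
print("alternative embodiments:", [name for name, _ in alternatives])
```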



Table 26B. Encoded NOV26 Protein Sequence (SEQ ID NO:80) 



MAKSVEQLPWARHCAEYMDMFSSLILTTPRVSPKLGYPIJ^SHHySIKSIiWGEKLENPALYIiDWQS 

PTGGS VRI KKDPELWTDLRFGTI PVRLFQPKAASSRPRRGI I FYHGGATVFGSLDCYHGLCNYLARETESVLLMI 

GYRKLPDHHSPALFQDCMNASIHFLKALETYGVDPSRVVVCGESVGGAAVAAITQALVGRSDLPRIRAQV^ 

QAFCLQSPSFQQNQNVPLLSRKFMVTSI*Cira:AIDI.SWRDAII*NGTCVPPDVWR^ 

WSPGPFNEAAYLEAKHMIJ)VENSPLIADDEVIAQLPEAFLVSCEiroiIJ®DSLIi^ 

GSIIFFDKKALSFPCSLKIVNAWSYIKGI 

The NOV26 amino acid sequence was found to have 116 of 325 amino acid residues 
(35%) identical to, and 183 of 325 amino acid residues (56%) similar to, the 398 amino acid 
residue ptnr:TREMBLNEW-ACC:AAG60035 protein from Mus musculus (Mouse) 
(ARYLACETAMIDE DEACETYLASE) (E = 5.4e-^^). 

NOV26 is expressed in at least the following tissues: Pooled human melanocyte, fetal 
heart, and pregnant uterus. Expression information was derived from the tissue sources of the 
sequences that were included in the derivation of the sequence of CuraGen Acc. No. 
CG57224-01. The sequence is predicted to be expressed in the brain because of the 
expression pattern of (GENBANK-ID: gb:GENBANK-ID:AB037784|acc:AB037784.1), a 
closely related Homo sapiens mRNA for KIAA1363 protein, partial cds homolog in species 
Homo sapiens. 

Homologies to the above NOV26 proteins will be shared by the other NOV26 
proteins insofar as they are homologous to each other. Any reference to NOV26 is assumed 
to refer to NOV26 proteins in general. 

NOV26 has homology to the amino acid sequences shown in the BLASTP data listed 
in Table 26C. 



Table 26C. BLAST results for NOV26 

Gene Index/Identifier                       Protein/Organism                                                          Length (aa)  Identity (%)   Positives (%)  Expect
gi|17438979|ref|XP_060166.1| (XM_060166)    similar to ARYLACETAMIDE DEACETYLASE (AADAC) (H. sapiens) [Homo sapiens]  407          327/330 (99%)  328/330 (99%)  0.0
gi|17438981|ref|XP_060167.1| (XM_060167)    similar to arylacetamide deacetylase (H. sapiens) [Homo sapiens]          409          185/388 (47%)  244/388 (62%)  2e-94
gi|7513557|pir||A58922                      esterase/N-deacetylase (EC 3.5.1.-), 50K hepatic - rabbit                 398          117/333 (35%)  179/333 (53%)  2e-46
gi|4557227|ref|NP_001077.1| (NM_001086)     arylacetamide deacetylase [Homo sapiens]                                  399          127/379 (33%)  200/379 (52%)  8e-46
gi|10120490|ref|NP_065413.1| (NM_020538)    arylacetamide deacetylase [Rattus norvegicus]                             398          113/330 (34%)  179/330 (54%)  8e-46

The homology of these sequences is shown graphically in the ClustalW analysis 
shown in Table 26D. 



Table 26D. ClustalW Analysis of NOV26 



1) NOV26 (SEQ ID NO:80) 
2) gi|17438979 (SEQ ID NO:351) 
3) gi|17438981 (SEQ ID NO:352) 
4) gi|7513557 (SEQ ID NO:353) 
5) gi|4557227 (SEQ ID NO:354) 
6) gi|10120490 (SEQ ID NO:355) 

[ClustalW multiple alignment of sequences 1 to 6; alignment body not recoverable from the source text.] 

Table 26E lists the domain description from DOMAIN analysis results against 
NOV26. This indicates that the NOV26 sequence has properties similar to those of other 
proteins known to contain these domains. 



Table 26E. Domain Analysis of NOV26 

gnl|Pfam|pfam00135, COesterase, Carboxylesterase. 
CD-Length = 532 residues, 22.2% aligned 
Score = 43.5 bits (101), Expect = 2e-05 

NOV26: 104 LFQPKAASSRPRRGIIFY-HGGATWGS-IJ^CraGLCimARETESVLLMIGYR 
           II + + 111 +111 I M nil -^^^ I II 
Sbjct: 109 VYTPKNRKPNSKLPVMWIHGGGFMFGSGLSLYD6E--SIJUiEGNVIWSINyRLGPLGF 166 

NOV26: 156 -KLPDHHSPALFQDCMiaASIHF-LKAIiE TYGVDPSRVWCGESVGGAAVAAIT 206 
           II I + 11+ +1 It 1 + III III+I+ + 
Sbjct: 167 LSTGDDVLPG NYGLLDQRIALKWVQDNIAAFGGDPDSVTIFGESAGGASVSLLL 220 

NOV26: 207 QALVGR 212 (SEQ ID NO:356) 
           + + 
Sbjct: 221 LSPSSK 226 (SEQ ID NO:357) 



The deacetylation of monoacetyldapsone (MADDS) has been examined in liver 
microsomes and cytosol from male Sprague-Dawley rats, Golden Syrian hamsters, and Swiss 
Albino mice. All three rodent species demonstrated greater MADDS deacetylation activity in 
liver microsomes than in liver cytosol. Further investigations were conducted in hamsters. 
The velocity of MADDS deacetylation in major organs in the hamster was greatest in the 
intestine, followed by the liver and kidney. The effect of pretreatment with common inducers 
on liver microsomal deacetylation activity was also examined in the hamster. Phenobarbital, 
100 mg/kg/day x 3 days, did not alter activity, while dexamethasone at the same dose reduced 
2-acetylaminofluorene (2-AAF), MADDS, and p-nitrophenyl acetate (NPA) hydrolysis by at 
least 50%. Due to a previous report that KI activated the deacetylation of an arylacetamide in 
vitro (Khanna et al., J Pharmacol Exp Ther 262: 1225-1231, 1992), the effects of the halides 
KF, KCl, KBr and KI on MADDS hydrolysis in vitro were tested. Of the halides studied, 
only KF altered MADDS hydrolysis, resulting in an almost complete inhibition of 
deacetylase activity at 50 mM (with the initial concentration of MADDS at 0.6 mM) with an 
IC50 = 0.16 mM. Cornish-Bowden and Dixon plots indicated that the inhibition exerted by 
KF was non-competitive. The rank order of inhibitor potencies was constructed using 
phenylmethylsulfonyl fluoride (PMSF), bis(p-nitrophenyl)phosphate (BNPP), physostigmine, 
and KF with 2-AAF, MADDS, and NPA as substrates. Different rank order potencies were 
obtained for each of the substrates tested. The substrates 2-AAF, MADDS, and NPA did not 
act as competitive inhibitors on the hydrolysis rates of each other. Liver microsomal 
arylacetamide deacetylase activity was greater in male hamsters than in females with either 
MADDS or 2-AAF as substrates; however, hydrolysis of NPA was similar in both male and 
female hamsters. These data support the hypothesis that the enzyme which catalyzes the 
hydrolysis of MADDS differs from that catalyzing either 2-AAF or NPA hydrolysis. 

The relative ability of arylacetamide deacetylase enzyme systems of dog liver to carry 
out the deacetylation of the carcinogens 4-acetylaminobiphenyl, 2-acetylaminofluorene, and 
2-acetylaminonaphthalene was examined. The arylacetamides were incubated with unfortified 
dog liver microsomes, and enzyme activity (nmol arylamine/mg protein/hr) was estimated by 
colorimetric quantitation of the resulting arylamines. The dog liver enzyme system displayed 
characteristics similar to those described for the rodent liver enzyme system in that enzyme 
activity was greatest in liver tissue, was localized in the microsomal subcellular fraction, 
required no cofactors, and was inhibited by heat, sodium fluoride, and thiol reagents. In five 
replicate assays, the relative rates of deacetylation were about 10, 6, and 1 with 
4-acetylaminobiphenyl (84.8 +/- 12.4), 2-acetylaminofluorene (52.5 +/- 5.1), and 
2-acetylaminonaphthalene (8.8 +/- 3.3), respectively. As a canine urinary bladder carcinogen, 
4-acetylaminobiphenyl is considered more potent than 2-acetylaminofluorene, while 
2-acetylaminonaphthalene is devoid of detectable carcinogenic activity, despite the fact that 
2-aminonaphthalene is a well-established canine urinary bladder carcinogen. Removal of the 
acetyl group may be a requirement for urinary bladder carcinogenesis; accordingly, the 
present studies demonstrate the appearance of a direct relationship between dog liver 
deacetylase enzyme specificity and urinary bladder susceptibility to these carcinogenic 
arylacetamides. 
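
The relative rates of about 10, 6, and 1 quoted above can be checked by normalizing each mean activity to the slowest substrate. The following Python sketch is illustrative only; the dictionary and variable names are hypothetical, and only the mean values (nmol arylamine/mg protein/hr) come from the text.

```python
# Mean deacetylation activities quoted in the text
# (nmol arylamine/mg protein/hr); the +/- spreads are ignored here.
mean_activity = {
    "4-acetylaminobiphenyl": 84.8,
    "2-acetylaminofluorene": 52.5,
    "2-acetylaminonaphthalene": 8.8,
}

# Normalize every substrate to the slowest one.
slowest = min(mean_activity.values())
relative_rate = {name: rate / slowest for name, rate in mean_activity.items()}

for name, ratio in relative_rate.items():
    print(f"{name}: {ratio:.1f}")
```

Rounded to whole numbers, the three ratios agree with the approximate 10 : 6 : 1 ordering stated above.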

The protein similarity information, expression pattern, cellular localization, and map 
location for the NOV26 protein and nucleic acid disclosed herein suggest that this 
Arylacetamide Deacetylase-like protein may have important structural and/or physiological 
functions characteristic of the Protease family. Therefore, the nucleic acids and proteins of 
the invention are useful in potential diagnostic and therapeutic applications and as a research 
tool. These include serving as a specific or selective nucleic acid or protein diagnostic and/or 
prognostic marker, wherein the presence or amount of the nucleic acid or the protein is to be 
assessed. These also include potential therapeutic applications such as the following: (i) a 
protein therapeutic, (ii) a small molecule drug target, (iii) an antibody target (therapeutic, 
diagnostic, drug targeting/cytotoxic antibody), (iv) a nucleic acid useful in gene therapy (gene 
delivery/gene ablation), (v) an agent promoting tissue regeneration in vitro and in vivo, and 
(vi) a biological defense weapon. 

The NOV26 nucleic acids and proteins of the invention have applications in the 
diagnosis and/or treatment of various diseases and disorders. For example, the compositions 
of the present invention will have efficacy for the treatment of patients suffering from: Von 
Hippel-Lindau (VHL) syndrome, Alzheimer's disease, stroke, tuberous sclerosis, 
hypercalcemia, Parkinson's disease, Huntington's disease, cerebral palsy, epilepsy, 
Lesch-Nyhan syndrome, multiple sclerosis, ataxia-telangiectasia, leukodystrophies, behavioral 
disorders, addiction, anxiety, pain, neuroprotection, as well as other diseases, disorders and 
conditions. 

These materials are further useful in the generation of antibodies that bind 
immunospecifically to the novel substances of the invention for use in therapeutic or 
diagnostic methods. These antibodies may be generated according to methods known in the 
art, using prediction from hydrophobicity charts, as described in the "Anti-NOVX 
Antibodies" section below. The disclosed NOV26 protein has multiple hydrophilic regions, 
each of which can be used as an immunogen. In one embodiment, a contemplated NOV26 
epitope is from about amino acids 5 to 10. In another embodiment, a contemplated NOV26 
epitope is from about amino acids 40 to 55. In other specific embodiments, contemplated 
NOV26 epitopes are from about amino acids 60 to 85, 105 to 120, 140 to 142, 155 to 162, 
240 to 252, 260 to 340 and 350 to 380. 

NOV27 

A disclosed NOV27 (designated CuraGen Acc. No. CG57288-01), which encodes a 
novel Olfactory Receptor-like protein and includes the 1008 nucleotide sequence (SEQ ID 
NO:81) is shown in Table 27A. An open reading frame for the mature protein was identified 
beginning with a GCA initiation codon at nucleotides 1-3 and ending with a TGA stop 
codon at nucleotides 922-924. Putative untranslated regions are underlined in Table 27A, 
and the start and stop codons are in bold letters. 
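
The open reading frame coordinates above fix the length of the encoded polypeptide. The Python sketch below is a minimal consistency check, not part of the disclosed method; the function name `codons_before_stop` is hypothetical, and it assumes 1-based nucleotide numbering with the first codon counted as the first residue.

```python
def codons_before_stop(first_codon_start: int, stop_codon_start: int) -> int:
    """Count the codons from the first codon up to, but excluding, the stop codon.

    Both arguments are 1-based positions of the first base of the codon.
    """
    span = stop_codon_start - first_codon_start
    if span <= 0 or span % 3 != 0:
        raise ValueError("stop codon is not in frame with the first codon")
    return span // 3


# NOV27: first codon at nucleotides 1-3, TGA stop codon at nucleotides 922-924.
nov27_residues = codons_before_stop(1, 922)
print(nov27_residues)
```

For NOV27 this gives 307 codons, which agrees with the 307 amino acid residues reported for the NOV27 polypeptide (SEQ ID NO:82).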

Table 27A. NOV27 Nucleotide Sequence (SEQ ID NO:81) 

GCAGAGGAGCTCCTTGGATTTTCTTATCTCCATGAGTTCCTVGGTTCTGCTGTTTGCTCTGATCCr 

TGCTGATGCTOOTSGGCAACCTGGCCATCATCAGCTTCATTTGCCTTGATTCCCG 

CTTCCTCTGCAACTTCTCCCrCATGGAGATGCTGGTCACCTCCAC^^ 

TCCACrCACAAGACCATGTCCCTGGCCAAATGCCTAACCCAGTCrTTCTT^ 

TCCTGATACTCATGGTCATGGCCITTGATCGCTACGTGGCCATCKJ^ 

TGGTCCAGTGTGTGTGAAGCTGGTGGTGGCCTGTTGQGTGGTTGGTTTCCTCT 

AAAACACGACTCTGGTTCTGTGGCCCTAACATCATCGGCCACTACITCrGTG^ 

CCTGCTCTGACACCCGCCACaTTGAGCGCATGGACCTCTTCCTGTCCCTGCTCTTTGI^ 

TATCATCCTCTCCTACSVTCCTCATTGTGGCTGCAGTGCTGCACATCCCTTCCTCCTCTGGATGCC^ 

TCCACCTGTGCCCCTC».CCTCaCAGTGGTGGTTCTGGGCTATGGCAGTGCCATCTTC^^ 

CTTCACCTTCCGGAATGAGAAGGTOAGGAGGTCSiTTGAGGATGTGACTAAAAGG^ 

GCCTGTAGGTG AGAGGGTGAGCCCTTGACAQGGCrAGAGAGCACCTGACAAGTCACGAC^ 

GTGGGCACCCACATGCCTAA 



The disclosed NOV27 nucleic acid has 540 of 892 bases (60%) identical to a 
gb:GENBANK-ID:AP002533|acc:AP002533.1 mRNA from Homo sapiens (Homo sapiens 
genomic DNA, chromosome 1q22-q23, CD1 region, section 2/4) (E = LSe""^^). 

The NOV27 polypeptide (SEQ ID NO:82) is 307 amino acid residues in length and is 
presented using the one-letter amino acid code in Table 27B. The SignalP, Psort and/or 
Hydropathy results predict that NOV27 has a signal peptide and is likely to be localized to 
the endoplasmic reticulum (membrane) with a certainty of 0.6850. In alternative 
embodiments, a NOV27 polypeptide is localized to the plasma membrane with a certainty of 
0.6400, the Golgi body with a certainty of 0.4600, or the endoplasmic reticulum (lumen) with 
a certainty of 0.1000. The SignalP predicts a likely cleavage site for a NOV27 peptide 
between amino acid positions 34 and 35, i.e., at the sequence NLA-II. 



Table 27B. Encoded NOV27 Protein Sequence (SEQ ID NO:82) 

AEELI^PSYLHEFQVLLFALIIJliIYVLMLLGNLAIISFICLDSRIJISPMYFFL 
LLSllIKTMSIiAKCLTQSFFYFSI/SSANFLIIJyrVMAFDRY^^ 

PTIiQKTRLWFCGPNI IGHYFCDSAPLIiKLACSDTRHIERMDLFLSLLFVLTTMLLI ILSYILI VAAVL 

GCQKAFSTCAPHLTVVVLGYGSAIFIYTOPGKGHSTYIJSnCAVAMV^ 

RIFIiCa[>PAACR 



The NOV27 amino acid sequence was found to have 143 of 295 amino acid residues 
(48%) identical to, and 198 of 295 amino acid residues (67%) similar to, the 313 amino acid 
residue ptnr:SPTREMBL-ACC:Q9Z1V0 protein from Mus musculus (Mouse) 
(OLFACTORY RECEPTOR C6) (E - l.le"^^. 

NOV27 is expressed in at least the following tissues: apical microvilli of the retinal 
pigment epithelium, arterial (aortic), basal forebrain, brain, Burkitt lymphoma cell lines, 
corpus callosum, cardiac (atria and ventricle), caudate nucleus, CNS and peripheral tissue, 
cerebellum, cerebral cortex, colon, cortical neurogenic cells, endothelial (coronary artery and 
umbilical vein) cells, palate epithelia, eye, neonatal eye, frontal cortex, fetal hematopoietic 
cells, heart, hippocampus, hypothalamus, leukocytes, liver, fetal liver, lung, lung lymphoma 
cell lines, fetal lymphoid tissue, adult lymphoid tissue, those that express MHC II and III 
nervous, medulla, subthalamic nucleus, ovary, pancreas, pituitary, placenta, pons, prostate, 
putamen, serum, skeletal muscle, small intestine, smooth muscle (coronary artery in aortic), 
spinal cord, spleen, stomach, taste receptor cells of the tongue, testis, thalamus, and thymus 
tissue. This information was derived by determining the tissue sources of the sequences that 
were included in the invention including but not limited to SeqCalling sources, public EST 
sources, literature sources, and/or RACE sources. 

Possible small nucleotide polymorphisms (SNPs) found for NOV27 are listed in 
Table 27C. 



Table 27C: SNPs 

Variant     Nucleotide Position  Base Change  Amino Acid Position  Amino Acid Change
13377027    620                  C>A          207                  Pro>His



Homologies to any of the above NOV27 proteins will be shared by the other NOV27 
proteins insofar as they are homologous to each other as shown above. Any reference to 
NOV27 is assumed to refer to both of the NOV27 proteins in general, unless otherwise noted. 

NOV27 also has homology to the amino acid sequences shown in the BLASTP data 
listed in Table 27D. 



Table 27D. BLAST results for NOV27 

Gene Index/Identifier                       Protein/Organism                                                             Length (aa)  Identity (%)   Positives (%)  Expect
gi|15723374|ref|NP_277054.1| (NM_033519)    olfactory receptor sdolf [Homo sapiens]                                      280          279/280 (99%)  279/280 (99%)  e-134
gi|15293799|gb|AAK95092.1| (AF399607)       olfactory receptor [Homo sapiens]                                            216          215/216 (99%)  215/216 (99%)  2e-98
gi|17476501|ref|XP_063251.1| (XM_063251)    similar to OLFACTORY RECEPTOR-LIKE PROTEIN F6 (H. sapiens) [Homo sapiens]    1056         145/295 (49%)  210/295 (71%)  4e-80
gi|17464943|ref|XP_069610.1| (XM_069610)    similar to olfactory receptor sdolf (H. sapiens) [Homo sapiens]              313          155/295 (52%)  210/295 (70%)  3e-74
gi|17476599|ref|XP_063285.1| (XM_063285)    similar to olfactory receptor sdolf (H. sapiens) [Homo sapiens]              347          149/295 (50%)  207/295 (69%)  3e-64











The homology of these sequences is shown graphically in the ClustalW analysis 
shown in Table 27E. 



Table 27E. ClustalW Analysis of NOV27 



1) NOV27 (SEQ ID NO:82) 
2) gi|15723374 (SEQ ID NO:358) 
3) gi|15293799 (SEQ ID NO:359) 
4) gi|17476501 (SEQ ID NO:360) 
5) gi|17464943 (SEQ ID NO:361) 
6) gi|17476599 (SEQ ID NO:362) 



[ClustalW multiple alignment of sequences 1 to 6; alignment body not recoverable from the source text.] 



Table 27F lists the domain description from DOMAIN analysis results against 
NOV27. This indicates that the NOV27 sequence has properties similar to those of other 
proteins known to contain the 7 transmembrane receptor domain. 

Table 27F. Domain Analysis of NOV27 

gnl|Pfam|pfam00001, 7tm_1, 7 transmembrane receptor (rhodopsin family). 
CD-Length = 254 residues, 98.4% aligned 
Score = 73.2 bits (178), Expect = 2e-14 

NOV27: 35  I ISFICIJ>SRIiHSPMyFFLCNFSIJy!EMVVTSTVVHRMI^ 94 
           -^11 +1+1 II I +++ 1+ 1 1+ II 
Sbjct: 5   VILVILRTKKLRTPTNIFIiLNIiAVADLLFLLTLPPWi^YYLVGGDWVFGDi^C^ 64 

NOV27: 95  FSLGSANFLILMVMATORYVAICHPLRYPTITNGPVCV^ 154 
           1 h IH llkll mil 1 1+-^ Ik 1 + 1 1 
Sbjct: 65  VVNGYASILLLTAISIDRYLAIVHPLRYRRIRTPRRAKVLILLVWVLA^ 124 

NOV27: 155 RLWFCGPNIIGHYFOTSAPLLKlJVCSDTRHIERMDLFI^IiFVLTTMIiIiIIIiSYILIVA^ 
           1 +1 + + 1 1 ++ 1 1 +1 1 
Sbjct: 125 LRTVEEGNTTVCLIDFPEESVKRSYVLLSTLVGFVLPLLVILVCYTRILRTLRKRARSQR 184 

NOV27: 215 VLHIPSSSGCQKAFSTCAPHLTVWLGYGSAIFIYVRP GKGHSTYI*NKAVAMVTAM 270 
           i III + 1 +1+ 1 + + ^ 1 
Sbjct: 185 SLKRRSSSERKAAKMIJJVVVWFVLCWLPYHIVIJ^DSI.CLLSIWRVLPTi^ 244 

NOV27: 271 VTPFLNPFIF 280 (SEQ ID NO:363) 
           1 III 1+ 
Sbjct: 245 VNSCLNPIIY 254 (SEQ ID NO:364) 





G-Protein Coupled Receptors (GPCRs) have been identified as an extremely large 
family of protein receptors in a number of species. At the phylogenetic level they can be 
classified into four major subfamilies. These receptors share a seven transmembrane domain 
structure with many neurotransmitter and hormone receptors. They are likely to be involved 
in the recognition and transduction of various signals mediated by G-Proteins, hence their 
name G-Protein Coupled Receptors. The human GPCR genes are generally intron-less and 
belong to four gene subfamilies, displaying great sequence variability. These genes are 
dominantly expressed in olfactory epithelium. 

Olfactory receptors (ORs) have been identified as an extremely large family of GPCRs in 
a number of species. As members of the GPCR family, these receptors share a seven 
transmembrane domain structure with many neurotransmitter and hormone receptors, and are 
likely to underlie the recognition and G-protein-mediated transduction of odorant signals. 
Like other GPCRs, the ORs can be expressed in a variety of tissues where they are thought to 
be involved in recognition and transmission of a variety of signals. The human OR genes are 
typically intron-less and belong to four different gene subfamilies, displaying great sequence 
variability. These genes are dominantly expressed in olfactory epithelium. 

The protein similarity information, expression pattern, cellular localization, and map 
location for the NOV27 protein and nucleic acid disclosed herein suggest that this Olfactory 
Receptor-like protein may have important structural and/or physiological functions 
characteristic of the Olfactory Receptor family. Therefore, the nucleic acids and proteins of 
the invention are useful in potential diagnostic and therapeutic applications and as a research 
tool. These include serving as a specific or selective nucleic acid or protein diagnostic and/or 
prognostic marker, wherein the presence or amount of the nucleic acid or the protein is to be 
assessed. These also include potential therapeutic applications such as the following: (i) a 
protein therapeutic, (ii) a small molecule drug target, (iii) an antibody target (therapeutic, 
diagnostic, drug targeting/cytotoxic antibody), (iv) a nucleic acid useful in gene therapy (gene 
delivery/gene ablation), (v) an agent promoting tissue regeneration in vitro and in vivo, and 
(vi) a biological defense weapon. 

The NOV27 nucleic acids and proteins of the invention are useful in potential 
diagnostic and therapeutic applications implicated in various diseases and disorders described 
below and/or other pathologies. For example, the compositions of the present invention will 
have efficacy for treatment of patients suffering from: developmental diseases, MHC II and 
III diseases (immune diseases), taste and scent detectability disorders, Burkitt's lymphoma, 
corticoneurogenic disease, signal transduction pathway disorders, retinal diseases 
including those involving photoreception, cell growth rate disorders, cell shape disorders, 
feeding disorders, control of feeding, potential obesity due to over-eating, potential disorders 
due to starvation (lack of appetite), non-insulin-dependent diabetes mellitus (NIDDM1), 
bacterial, fungal, protozoal and viral infections (particularly infections caused by HIV-1 or 
HIV-2), pain, cancer (including but not limited to neoplasm; adenocarcinoma; lymphoma; 
prostate cancer; uterus cancer), anorexia, bulimia, asthma, Parkinson's disease, acute heart 
failure, hypotension, hypertension, urinary retention, osteoporosis, Crohn's disease, multiple 
sclerosis, and treatment of Albright Hereditary Osteodystrophy, angina pectoris, 
myocardial infarction, ulcers, asthma, allergies, benign prostatic hypertrophy, and psychotic 
and neurological disorders, including anxiety, schizophrenia, manic depression, delirium, 
dementia, severe mental retardation, dentatorubro-pallidoluysian atrophy (DRPLA), 
hypophosphatemic rickets, autosomal dominant (2), acrocallosal syndrome, and dyskinesias, 
such as Huntington's disease or Gilles de la Tourette syndrome, and/or other pathologies and 
disorders of the like. The polypeptides can be used as immunogens to produce antibodies 
specific for the invention, and as vaccines. They can also be used to screen for potential 
agonist and antagonist compounds. For example, a cDNA encoding the OR-like protein may 
be useful in gene therapy, and the OR-like protein may be useful when administered to a 
subject in need thereof. By way of nonlimiting example, the compositions of the present 
invention will have efficacy for treatment of patients suffering from bacterial, fungal, 
protozoal and viral infections (particularly infections caused by HIV-1 or HIV-2), pain, 
cancer (including but not limited to neoplasm; adenocarcinoma; lymphoma; prostate cancer; 
uterus cancer), anorexia, bulimia, asthma, Parkinson's disease, acute heart failure, 
hypotension, hypertension, urinary retention, osteoporosis, Crohn's disease, multiple 
sclerosis, and treatment of Albright Hereditary Osteodystrophy, angina pectoris, 
myocardial infarction, ulcers, asthma, allergies, benign prostatic hypertrophy, and psychotic 
and neurological disorders, including anxiety, schizophrenia, manic depression, delirium, 
dementia, severe mental retardation and dyskinesias, such as Huntington's disease or Gilles 
de la Tourette syndrome, and/or other pathologies and disorders. The novel nucleic acid 
encoding the OR-like protein, and the OR-like protein of the invention, or fragments thereof, 
may further be useful in diagnostic applications, wherein the presence or amount of the 
nucleic acid or the protein is to be assessed. 

These materials are further useful in the generation of antibodies that bind 
immunospecifically to the novel substances of the invention for use in therapeutic or 
diagnostic methods. These antibodies may be generated according to methods known in the 
art, using prediction from hydrophobicity charts, as described in the "Anti-NOVX 
Antibodies" section below. The disclosed NOV27 protein has multiple hydrophilic regions, 
each of which can be used as an immunogen. In one embodiment, a contemplated NOV27 
epitope is from about amino acids 45 to 55. In another embodiment, a contemplated NOV27 
epitope is from about amino acids 75 to 95. In other specific embodiments, contemplated 
NOV27 epitopes are from about amino acids 110 to 140, 150 to 180, 210 to 240, 250 to 265 
and 270 to 295. 

NOV28 

A disclosed NOV28 (designated CuraGen Acc. No. CG57213-01), which encodes a 
novel PB39-like protein and includes the 2233 nucleotide sequence (SEQ ID NO:83) is 
shown in Table 28A. An open reading frame for the mature protein was identified beginning 
with an ATG initiation codon at nucleotides 77-79 and ending with a TAG stop codon at 
nucleotides 1661-1663. Putative untranslated regions are underlined in Table 28A, and the 
start and stop codons are in bold letters. 

Table 28A. NOV28 Nucleotide Sequence (SEQ ID NO:83) 

CCGGGGCTGGAGGGGGGCAAGCGGGTTCCGAGGTGCAAAGCCTGGTGCCCCGAGCCCTGCGGAGCTCGGGGCCA~ 

GCATGGCCCCCACGCTGCAACAGGCGTACCGGAGGCGCTGGTGGATGGCCTGCACGGCTGTGCTGGAGAACCTC 

TTCTTCTCTGCTGTACTCCTGGGCTGGGGCTCCCTGTTGATCATTCTGAAGAACGAGGGCTTCTATTCCAGCAC 

GTGCCCAGCTGAGAGCAGCACCAACACCACCCAGGATGAGCAGCGCAGGTGGCCAGGCTGTGACCAGCAGGACG 

AGATGCTCAACCTGGGCTTCACCATTGGTTCCTTCGTGCTCAGCGCCACCACCCTGCCACTGGGGATCCTCATG 

GACCQCTTTGGCCCCCGACCCGTGCGGCTGGTTGGCAGTGCCTGCTTCACrGCGTCCTGCACCCTCATGGCCCT 






GGCCTCCCGGGACGTGGAAGCTCTGTCTCCGTTGATATTCCTGGCGCTGTCCCTGAATGGCTTTGGTGGCATCT 

GCCTAACGTTCACTTCACrCAAGCTGATCTACGATGCCGGTGTGGCCTTCGTGGTCATCATGlTCACCT 

GGCCTGGCCTGCCTTATCTTTCTGAACTGCACCCTCAACTGGCCCATCGAAGCCTTTCCTGCCCCTGAGGAAGT 

CAATTACACGAAGAAGATCAAGCTGAGTGGGCTGGCCCTGGACCACAAGGTGACAGGTGACCTCTTCTACACCC 

ATGTGACCACCATGGGCCAGAGGCTCAGCCAGAAGGCCCCCAGCCTGGAGGACGGTTCGGATGCCTTCATGTCA 

CCCCAGGATGTTCGGGGCACCTCAGAAAACCITrCCTGAGAGGTCTGTCCCCTTACGCAAGAGCCTCT 

CACrTTCCTGTGGAGCCTCCTCACCATGTGCATGACCCAGCTGCGGATCATCTTCTACATGGCTGCTGTGAACA 

AGATGCTGGAGTACCTTGTGACTGGTGGCCAGGAGCATGAGACAAATGAACAGCAACAAAAGGTGGCAGAGACA 

GTTGGGTTCTACTCCTCCGTCTTCGGGGCCATGCAGCTGTTGTGCCTTCTCACCTGCCCCCTCATTGGCTACAT 

CATGGACrrGGCGGATCAAGGACTGCGTGGACGCCCCAACTCAGGGCACTGTCCTCGGAGATGCCAGG^ 

TTGCTACCAAATCCATCAGACCACGCTACTGCAAGATCCAAAAGCTCACCAATGCCATCAGTGCCTTCACCOT 

ACCAACCTGCTGCTTGTGGGTTTTGGCATCACCTGTCTCATCAACAACTTACACCTCCAGTTTGTGACCTTTGT 

CCTGCACACCATTGTTCGAGGTTTCTTCCACTCAGCCTGTGGGAGTCTCTATGCTGCAGTGTTCCCATCCAACC 

ACTTTGGGACGCTGACT^GGCCTGCAGTCCCTCATCAGTGCTGTGTTCGCCTTGCTTCAGCAGCCACTTTTCATG 

GCGATGGTGGGACCCCTGAAAGGAGAGCCCTTCTGGGTGAATCTGGGCCTCCTGCTATTCTCACTCCTGGGATT 

CCTGTTGCCTTCCTACCTCTTCTATTACCGTGCCCGGCTCCAGCAGGAGTACGCCGCCAATGGGATGGGCCCAC 

TGAAGGTGCTTAGCGGCTCTGAGGTGACCGCATA GACTTCTCAGACCAAGGGACCTGGATGACAGGCAATCAAG 

GCCTGAGCAACCAAAAGGAGTGCCCCATATGGCTTTTCTACCTGTAACATGCACATAGAGCCATGGCCGTAGAT 

TTATAAATACCAAGAGAAGTTCTATTTTTGTAAAGACTGCAAAAAGGAGGAAAAAAAACCTTCT^AAAACG^ 

CTAAGTCAACGCTCCATTGACTGAAGACAGTCCCTATCCTAGAGGGGTTGAGCTTTCTTCCTCCTTGGGTTGGA 

GGAGACCAGGGTGCCTCTTATCTCCTTCTAGCGGTCTGCCTCCTGGTACCTCTTGGGGGGATCGGCAAACAGGC 

TACCCCTGAGGTCCCATGTGCCATGAGTGTGCACAACATGCAATGTGTCTGTGTATGTGTGAATGTGAGAAAAA 

CACAGCCCTCCTTTCAGAAGGAAAGGGGCCTGAGGTGCCAGCTGTGTCCTGGGTTAGGGGTTGGGGGTCGGCCC 

eiTCCAGGGCCAGGAAGGCAGGTTCCCTCTCTGGTGCTGCTGCTTGCAAGTCTTAGAGGAAATAAAAA 

TGAGAAAAAAAAA 



The disclosed NOV28 nucleic acid has been mapped to chromosome 11p11.2-p11.1 
and has 1866 of 1993 bases (93%) identical to a 
gb:GENBANK-ID:AF045584|acc:AF045584.1 mRNA from Homo sapiens (Homo sapiens PB39 mRNA, 
complete cds) (E = 0.0). 

The NOV28 polypeptide (SEQ ID NO:84) is 528 amino acid residues in length and is 
presented using the one-letter amino acid code in Table 28B. The SignalP, Psort and/or 
Hydropathy results predict that NOV28 has a signal peptide and is likely to be localized to 
the mitochondrial inner membrane with a certainty of 0.6450. In alternative embodiments, a 
NOV28 polypeptide is localized to the plasma membrane with a certainty of 0.6000, the 
mitochondrial intermembrane space with a certainty of 0.5634, or the mitochondrial matrix 
space with a certainty of 0.4367. The SignalP predicts a likely cleavage site for a NOV28 
peptide between amino acid positions 44 and 45, i.e., at the sequence NEG-FY. 



Table 28B. Encoded NOV28 Protein Sequence (SEQ ID NO:84) 

M^lT^QQAYRRRmiSAC^ 

LGFTIGSFVLSATTLPLGILMDRFGPRPVRLVGSACFTASCTLMAIJ^RDVRALSPLIFLALSI^GFC^ 

KLI YDAGVAFWIMFTWSGIACLI FLNCTLNWP I EAFPAPEEVNYTKKI 3CLSGLALDHKVTGDLFYTHVTTMGQRLS 

QKAPSLEDGSDAFMSPQDVRGTSENLPERSVPLRKSLCSPTFLWSLLTMCMTQLRI I FYMAAVNKMLEYLVTGGQEH 

ETNEQQQKVAETVGFYSSVFGAMQLLCLLTCPLIGYIMDWRIKDCVDAPTQGTVIX;DARIX3VATKSIRPR^ 

tnaisaftltnlllvgfgitclinnlhlqfvtfvlhtivrgffhsacgslyaavfpsnhfgtltglqslisavf^^ 
qqplfmamvgplkgepfwvnlglllfsllgfllpsylfyyrarlqqeyaangmgplkvlsgsevta 



The NOV28 amino acid sequence was found to have 384 of 419 amino acid residues
(91%) identical to, and 391 of 419 amino acid residues (93%) similar to, the 559 amino acid
residue ptnr:SPTREMBL-ACC:O75387 protein from Homo sapiens (Human) (PB39) (E =
9.3e-^^).
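The identity and similarity percentages quoted above follow directly from the match counts. The sketch below is illustrative only: the helper name `percent_truncated` is hypothetical, and it assumes that the reported percentages are truncated to a whole number rather than rounded, which is consistent with the figures quoted for NOV28.

```python
def percent_truncated(matches: int, aligned: int) -> int:
    """Return matches/aligned as a whole-number percentage, truncated (not rounded)."""
    if aligned <= 0 or matches < 0 or matches > aligned:
        raise ValueError("expected 0 <= matches <= aligned and aligned > 0")
    # Integer arithmetic avoids floating-point edge cases.
    return (100 * matches) // aligned


# Counts quoted in the text for NOV28 (assumed to be read correctly from the source).
protein_identity = percent_truncated(384, 419)      # amino acid identity
protein_similarity = percent_truncated(391, 419)    # amino acid similarity
nucleotide_identity = percent_truncated(1866, 1993) # nucleic acid identity

print(protein_identity, protein_similarity, nucleotide_identity)
```

With the counts above, the three values agree with the 91%, 93% and 93% stated for NOV28.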

NOV28 is expressed in at least the following tissues: adrenal gland, bone marrow,
brain - amygdala, brain - cerebellum, brain - hippocampus, brain - substantia nigra, brain -
thalamus, brain - whole, fetal brain, fetal kidney, fetal liver, fetal lung, heart, kidney,
lymphoma - Raji, mammary gland, pancreas, pituitary gland, placenta, prostate, salivary
gland, skeletal muscle, small intestine, spinal cord, spleen, stomach, testis, thyroid, trachea,
uterus, liver, lymphoid tissue, tonsils, and whole organism. Expression information was
derived from the tissue sources of the sequences that were included in the derivation of the
sequence of NOV28. The sequence is predicted to be expressed in prostate epithelium
because of the expression pattern of (GENBANK-ID: gb:GENBANK-ID:AF045584|acc:AF045584.1),
a closely related Homo sapiens PB39 mRNA, complete cds homolog.

Possible small nucleotide polymorphisms (SNPs) found for NOV28 are listed in 
Tables 28C and 28D. 



Table 28C: SNPs

Consensus Position    Depth    Base Change    PAF
22                    8        C>A            0.250
408                   4        G>T            0.500
418                   4        G>T            0.500
427                   4        G>T            0.500
454                   4        A>T            0.500
455                   4        G>C            0.500
458                   4        G>C            0.500
495                   4        G>C            0.500



Table 28D: SNPs

Variant     Nucleotide Position    Base Change    Amino Acid Position    Amino Acid Change
13377029    1488                   T>C            471                    Val>Ala



Homologies to any of the above NOV28 proteins will be shared by the other NOV28 
proteins insofar as they are homologous to each other as shown above. Any reference to 
NOV28 is assumed to refer to both of the NOV28 proteins in general, unless otherwise noted. 

NOV28 also has homology to the amino acid sequences shown in the BLASTP data
listed in Table 28E.

Table 28E. BLAST results for NOV28

Gene Index/Identifier                             Protein/Organism                            Length (aa)  Identity (%)    Positives (%)   Expect
gi|4505971|ref|NP_003618.1| (NM_003627)           prostate cancer overexpressed gene 1        559          527/559 (94%)   527/559 (94%)   0.0
                                                  [Homo sapiens]
gi|12847527|dbj|BAB27605.1| (AK011417)            data source:MGD, source key:MGI:1931352,    654          426/552 (77%)   466/552 (84%)   0.0
                                                  evidence:ISS-prostate cancer overexpressed
                                                  gene 1-putative [Mus musculus]
gi|15310953|ref|XP_046257.2| (XM_046257)          prostate cancer overexpressed gene 1        401          377/392 (96%)   382/392 (97%)   0.0
                                                  [Homo sapiens]
gi|18027388|gb|AAL55776.1|AF289592_1 (AF289592)   unknown [Homo sapiens]                      489          205/407 (50%)   263/407 (64%)   e-102
gi|18042965|gb|AAH19562.1|AAH19562 (BC019562)     Unknown (protein for IMAGE:3451144)         373          198/359 (55%)   257/359 (71%)   6e-99
                                                  [Homo sapiens]


The homology of these sequences is shown graphically in the ClustalW analysis 
shown in Table 28F. 



Table 28F. ClustalW Analysis of NOV28

1) NOV28 (SEQ ID NO:84)

2) gi|4505971 (SEQ ID NO:365)

3) gi|12847527 (SEQ ID NO:366)

4) gi|15310953 (SEQ ID NO:367)

5) gi|18027388 (SEQ ID NO:368)

6) gi|18042965 (SEQ ID NO:369)

[ClustalW alignment of sequences 1 to 6; the alignment is not legible in this copy.]



The gene PB39 (HGMW-approved symbol POV1), whose expression is up-regulated
in human prostate cancer, has been identified using tissue microdissection-based differential
display analysis. The full-length sequence of PB39 cDNA, the genomic localization of the
PB39 gene, and the genomic sequence of the mouse homologue have been reported. The full-
length human cDNA is 2317 nucleotides in length and contains an open reading frame of 559
amino acids which does not show homology with any reported human genes. The N-terminus
contains charged amino acids and a helical loop pattern suggestive of an srp leader sequence
for a secreted protein. Fluorescence in situ hybridization using PB39 cDNA as probe mapped
the gene to chromosome 11p11.1-p11.2. Comparison of PB39 cDNA sequence with murine
sequence available in the public database identifies a region of previously sequenced mouse
genomic DNA showing 67% amino acid sequence homology with human PB39. Based on
alignment and comparison to the human cDNA the mouse genomic sequence suggests there
are at least 14 exons in the mouse gene spread over approximately 100 kb of genomic
sequence. Further analysis of PB39 expression in human tissues shows the presence of a
unique splice variant mRNA that appears to be primarily associated with fetal tissues and
tumors. Interestingly, the unique splice variant appears in prostatic intraepithelial neoplasia, a
microscopic precursor lesion of prostate cancer. Comparison of expression levels in normal
epithelium and invasive carcinoma, using beta-actin as an internal control, has shown the
transcript to be substantially overexpressed in 5 of 10 carcinomas. The current data support
the hypothesis that PB39 plays a role in the development of human prostate cancer and will
be useful in the analysis of the gene product in further human and murine studies.

The protein similarity information, expression pattern, cellular localization, and map
location for the NOV28 protein and nucleic acid disclosed herein suggest that this PB39-like
protein may have important structural and/or physiological functions characteristic of the
transporter family. Therefore, the nucleic acids and proteins of the invention are useful in
potential diagnostic and therapeutic applications and as a research tool. These include serving
as a specific or selective nucleic acid or protein diagnostic and/or prognostic marker, wherein
the presence or amount of the nucleic acid or the protein is to be assessed. These also
include potential therapeutic applications such as the following: (i) a protein therapeutic, (ii) a
small molecule drug target, (iii) an antibody target (therapeutic, diagnostic, drug
targeting/cytotoxic antibody), (iv) a nucleic acid useful in gene therapy (gene delivery/gene
ablation), (v) an agent promoting tissue regeneration in vitro and in vivo, and (vi) a biological
defense weapon.

The NOV28 nucleic acids and proteins of the invention have applications in the 
diagnosis and/or treatment of various diseases and disorders. For example, the compositions 
of the present invention may have efficacy for the treatment of patients suffering from cancer, 
especially prostate cancer as well as other diseases, disorders and conditions. The expression 
of PB39 has been shown to be up-regulated in human prostate cancer and the current data 
support the hypothesis that PB39 plays a role in the development of prostate cancer and will 
be useful in the analysis of the gene product in further human and murine studies (Genomics
1998 Jul 15;51(2):282-7).

These materials are further useful in the generation of antibodies that bind 
immunospecifically to the novel substances of the invention for use in therapeutic or 
diagnostic methods. These antibodies may be generated according to methods known in the 
art, using prediction from hydrophobicity charts, as described in the "Anti-NOVX
Antibodies" section below. The disclosed NOV28 protein has multiple hydrophilic regions,
each of which can be used as an immunogen. In one embodiment, a contemplated NOV28 
epitope is from about amino acids 5 to 7. In another embodiment, a contemplated NOV28 
epitope is from about amino acids 70 to 80. In other specific embodiments, contemplated 
NOV28 epitopes are from about amino acids 200 to 215, 230 to 275, 312 to 310, 350 to 390 
and 495 to 510. 

NOV29 

A disclosed NOV29 (designated CuraGen Acc. No. CG56990-02), which encodes a
novel Oxytocin-like protein and includes the 415 nucleotide sequence (SEQ ID NO:85), is
shown in Table 29A. An open reading frame for the mature protein was identified beginning
with an ATG initiation codon at nucleotides 18-20 and ending with a TGA stop codon at
nucleotides 315-317. Putative untranslated regions are underlined in Table 29A, and the start
and stop codons are in bold letters.

Table 29A. NOV29 Nucleotide Sequence (SEQ ID NO:85)

CCaVGCGCACCCGCACCA TGGCCGGCCCCAGCCrCGCTTGCTGTOT^ 

CTGCTAC^TCCAGAACrraCCCCCTGGGAGGCAAGAGGGCCGCGCCGGAAQAGCTGGGCTGCT 
GCCGAAGaSCTGCGCTCCCaGGAGGAGAACrACCTGCCGTCGCCCTGCCA 
GCGGGGGCCGCTGCGCXSGTCTTGGGCCTCTGCTGCAGCCCGGACGGCTOCC^ 
GGAAGCCACCTTCTCCCa^GCGCTG AAACTTGATGGCTCCGAACS^CCCTC^ 

TAGCCAC<X:CAGAAATGGTGAAAATAAAATAAAGCAGGTTTTTCTCCTCT 

The disclosed NOV29 nucleic acid has been mapped to chromosome 20p13 and has
355 of 407 bases (87%) identical to a gb:GENBANK-ID:HUMOTCB|acc:M25650.1 mRNA 
from Homo sapiens (Human oxytocin mRNA, complete cds) (E = LSe"^^). 

A disclosed NOV29 polypeptide (SEQ ID NO:86) is 99 amino acid residues in length 
and is presented using the one-letter amino acid code in Table 29B. The SignalP, Psort 
and/or Hydropathy results predict that NOV29 has a signal peptide and is likely to be 
localized to the outside of the cell with a certainty of 0.8200. In alternative embodiments, a 
NOV29 polypeptide is located to the endoplasmic reticulum (membrane) with a certainty of 
0.1000, the endoplasmic reticulum (lumen) with a certainty of 0.1000, or the lysosome 
(lumen) with a certainty of 0.1000. The SignalP predicts a likely cleavage site for a NOV29 
peptide between amino acid positions 19 and 20, i.e., at the sequence TSA-CY.
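The predicted cleavage between positions 19 and 20 can be illustrated by splitting the N-terminal portion of the NOV29 sequence at that point. The sketch below is illustrative only: the function name `cleave_signal_peptide` is hypothetical, and the 31-residue N-terminal string is an assumption, read from Table 29B with the characters that are garbled in this copy resolved.

```python
def cleave_signal_peptide(sequence: str, last_signal_position: int) -> tuple[str, str]:
    """Split a protein at a predicted cleavage site.

    last_signal_position is the 1-based position of the final residue of the
    signal peptide; the mature chain begins at the following residue.
    """
    if not 0 < last_signal_position < len(sequence):
        raise ValueError("cleavage position must fall inside the sequence")
    return sequence[:last_signal_position], sequence[last_signal_position:]


# Assumed N-terminal residues 1-31 of NOV29 (SEQ ID NO:86), see lead-in.
NOV29_N_TERMINUS = "MAGPSLACCLLGLLALTSACYIQNCPLGGKR"

signal, remainder = cleave_signal_peptide(NOV29_N_TERMINUS, 19)
junction = f"{signal[-3:]}-{remainder[:2]}"

print(signal)     # residues 1-19
print(remainder)  # residues 20 onward
print(junction)   # residues flanking the cleavage site
```

Under these assumptions the junction string reproduces the TSA-CY site given in the text, and the chain after cleavage begins with the residues CYIQNCPLG.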



Table 29B. Encoded NOV29 Protein Sequence (SEQ ID NO:86) 

MAGPSLACCLLGLLALTSACYIQNCPLGGKRAAPEELGCFVGTAEALRCQEENYLPSPCQSGQKACGSGGRCAV

LGLCCSPDGCHADPACDAEATFSQR

The NOV29 amino acid sequence was found to have 65 of 65 amino acid residues 
(100%) identical to, and 65 of 65 amino acid residues (100%) similar to, the 125 amino acid 
residue ptnr:SWISSNEW-ACC:P01178 protein from Homo sapiens (Human) (OXYTOCIN- 
NEUROPHYSIN 1 PRECURSOR (OT-NPI) [CONTAINS: OXYTOCIN (OCYTOCIN); 
NEUROPHYSIN 1]) (E = l.9e^% 

NOV29 is expressed in at least the following tissues: adrenal gland, bone marrow,
brain - amygdala, brain - cerebellum, brain - hippocampus, brain - substantia nigra, brain -
thalamus, brain - whole, fetal brain, fetal kidney, fetal liver, fetal lung, heart, kidney,
lymphoma - Raji, mammary gland, pancreas, pituitary gland, placenta, prostate, salivary
gland, skeletal muscle, small intestine, spinal cord, spleen, stomach, testis, thyroid, trachea,
uterus, hypothalamus, and whole organism. Expression information was derived from
the tissue sources of the sequences that were included in the derivation of the sequence of
NOV29. The sequence is also predicted to be expressed in hypothalamus because of the
expression pattern of (GENBANK-ID: gb:GENBANK-ID:HUMOTCB|acc:M25650.1), a
closely related Human oxytocin mRNA, complete cds homolog.

NOV29 has homology to the amino acid sequences shown in the BLASTP data listed
in Table 29C.



Table 29C. BLAST results for NOV29

Gene Index/Identifier                     Protein/Organism                        Length (aa)  Identity (%)    Positives (%)   Expect
gi|4505537|ref|NP_000906.1| (NM_000915)   oxytocin-neurophysin I preproprotein;   125          99/125 (79%)    99/125 (79%)    5e-25
                                          oxytocin, prepro- (neurophysin I)
                                          [Homo sapiens]
gi|386991|gb|AAA98806.1| (M11186)         oxytocin-neurophysin I [Homo sapiens]   124          98/125 (78%)    98/125 (78%)    4e-23
gi|585553|sp|P01177|NEU1_PIG              Oxytocin-neurophysin 1 precursor        125          87/125 (69%)    90/125 (71%)    5e-21
                                          (OT-NPI) [Contains: Oxytocin
                                          (Ocytocin); Neurophysin 1]
gi|1346683|sp|P13389|NEU1_SHEEP           OXYTOCIN-NEUROPHYSIN 1 PRECURSOR        125          87/124 (70%)    90/124 (72%)    2e-20
                                          (OT-NPI) [CONTAINS: OXYTOCIN
                                          (OCYTOCIN); NEUROPHYSIN 1]
gi|128068|sp|P01175|NEU1_BOVIN            OXYTOCIN-NEUROPHYSIN 1 PRECURSOR        125          87/124 (70%)    89/124 (71%)    2e-20
                                          (OT-NPI) [CONTAINS: OXYTOCIN
                                          (OCYTOCIN); NEUROPHYSIN 1]


The homology of these sequences is shown graphically in the ClustalW analysis 
shown in Table 29D. 



Table 29D. ClustalW Analysis of NOV29

1) NOV29 (SEQ ID NO:86)

2) gi|4505537 (SEQ ID NO:370)

3) gi|386991 (SEQ ID NO:371)

4) gi|585553 (SEQ ID NO:372)

5) gi|1346683 (SEQ ID NO:373)

6) gi|128068 (SEQ ID NO:374)

[ClustalW alignment of sequences 1 to 6; the alignment is not legible in this copy.]




Table 29E lists the domain description from DOMAIN analysis results against 
NOV29. This indicates that the NOV29 sequence has properties similar to those of other 
proteins known to contain these domains. 



Table 29E. Domain Analysis of NOV29

gnl|Pfam|pfam00184, hormone5, Neurohypophysial hormones, C-terminal Domain. N-terminal
Domain is in hormone5

CD-Length = 79 residues, 72.2% aligned
Score = 62.4 bits (150), Expect = 1e-11

NOV29: 35 EELGCFVGTAEALRCQEENYLPSPCQSGQKACGS-GGRCAVLGLCCSPDGCHADPAC 90 (SEQ ID NO:375)
Sbjct: 23 EELGCYVGTPETARCQEENYLPSPCEAGGKPCGSDAGRCAAPGVCCDSESCWDPEC 79 (SEQ ID NO:376)

gnl|Smart|smart00003, NH, Neurohypophysial hormones; Vasopressin/oxytocin gene
family.

CD-Length = 79 residues, 72.2% aligned
Score = 60.1 bits (144), Expect = 6e-11

NOV29: 35 EELGCFVGTAEALRCQEENYLPSPCQSGQKACGS-GGRCAVLGLCCSPDGCHADPAC 90 (SEQ ID NO:377)
Sbjct: 23 EELGCYVGTPETARCQEENYLPSPCESGGRPCGSDGGRCAAPGICCDSESCAADPSC 79 (SEQ ID NO:378)



Oxytocin (OT), a nonapeptide, was the first hormone to have its biological activities 
established and chemical structure determined. Oxytocin and vasopressin are structurally and 
functionally related neurohypophysial peptide hormones. Oxytocin mediates contraction of 
the smooth muscle of the uterus and mammary gland, while vasopressin has antidiuretic 
action on the kidney, and mediates vasoconstriction of the peripheral vessels. In common 
with most active peptides, both hormones are synthesised as larger protein precursors that are
enzymatically converted to their mature forms. Members of this family are found in birds,
fish, reptiles and amphibians (mesotocin, isotocin, valitocin, glumitocin, aspargtocin,
vasotocin, seritocin, asvatocin, phasvatocin), in worms (annetocin), octopi (cephalotocin), 
locust (locupressin or neuropeptide F1/F2) and in molluscs (conopressins G and S). 

It was believed that OT is released from hypothalamic nerve terminals of the posterior 
hypophysis into the circulation where it stimulates uterine contractions during parturition, and 
milk ejection during lactation. However, equivalent concentrations of OT were found in the 
male hypophysis, and similar stimuli of OT release were determined for both sexes, 
suggesting other physiological functions. Indeed, recent studies indicate that OT is involved 
in cognition, tolerance, adaptation and complex sexual and maternal behavior, as well as in 
the regulation of cardiovascular functions. It has long been known that OT induces natriuresis 
and causes a fall in mean arterial pressure, both after acute and chronic treatment, but the 
mechanism was not clear. The discovery of the natriuretic family shed new light on this 
matter. Atrial natriuretic peptide (ANP), a potent natriuretic and vasorelaxant hormone, 
originally isolated from rat atria, has been found at other sites, including the brain. Blood 
volume expansion causes ANP release that is believed to be important in the induction of 
natriuresis and diuresis, which in turn act to reduce the increase in blood volume. 
Neurohypophysectomy totally abolishes the ANP response to volume expansion. This 
indicates that one of the major hypophyseal peptides is responsible for ANP release. 

The role of ANP in OT-induced natriuresis has been evaluated, and it has been 
hypothesized that the cardio-renal effects of OT are mediated by the release of ANP from the 
heart. The presence and synthesis of OT receptors in all heart compartments and the 
vasculature has been demonstrated. The functionality of these receptors has been established 
by the ability of OT to induce ANP release from perfused heart or atrial slices. Furthermore, 
it has been shown that the heart and large vessels like the aorta and vena cava are sites of OT 
synthesis. Therefore, locally produced OT may have important regulatory functions within 
the heart and vascular beds. Such functions may include slowing down of the heart or the 
regulation of local vascular tone. 

The protein similarity information, expression pattern, cellular localization, and map 
location for the NOV29 protein and nucleic acid disclosed herein suggest that this oxytocin- 
like protein may have important structural and/or physiological functions characteristic of the 
neurohypophysial hormone family. Therefore, the nucleic acids and proteins of the invention 
are useful in potential diagnostic and therapeutic applications and as a research tool. These 
include serving as a specific or selective nucleic acid or protein diagnostic and/or prognostic 
marker, wherein the presence or amount of the nucleic acid or the protein is to be assessed.
These also include potential therapeutic applications such as the following: (i) a protein
therapeutic, (ii) a small molecule drug target, (iii) an antibody target (therapeutic, diagnostic, 
drug targeting/cytotoxic antibody), (iv) a nucleic acid useful in gene therapy (gene 
delivery/gene ablation), (v) an agent promoting tissue regeneration in vitro and in vivo, and 
(vi) a biological defense weapon. 

The NOV29 nucleic acids and proteins of the invention have applications in the 
diagnosis and/or treatment of various diseases and disorders. For example, the compositions 
of the present invention may have efficacy for the treatment of patients suffering from 
reduced muscular tonus of the uterus, lactation problems, cardiovascular conditions, obesity 
as well as other diseases, disorders and conditions. It has been shown that there is inhibition 
by elevated circulating OT levels of glucocorticoid-induced, but not basal, leptin secretion in 
normal weight subjects, suggesting a possible role for OT in the regulatory control of leptin. 
Furthermore, the results obtained in obese subjects indicate that this regulation is disrupted in 
obesity (J Clin Endocrinol Metab 2000 Oct;85(10):3683-6). It has also been suggested that 
OT is involved in cognition, tolerance, adaptation and complex sexual and maternal behavior, 
as well as in the regulation of cardiovascular functions. Locally produced OT may have 
important regulatory functions within the heart and vascular beds. Such functions may 
include slowing down of the heart or the regulation of local vascular tone (Braz J Med Biol
Res 2000 Jun;33(6):625-33).

These materials are further useful in the generation of antibodies that bind 
immunospecifically to the novel substances of the invention for use in therapeutic or 
diagnostic methods. These antibodies may be generated according to methods known in the 
art, using prediction from hydrophobicity charts, as described in the "Anti-NOVX 
Antibodies" section below. The disclosed NOV29 protein has multiple hydrophilic regions, 
each of which can be used as an immunogen. In one embodiment, a contemplated NOV29 
epitope is from about amino acids 28 to 32. In another embodiment, a contemplated NOV29 
epitope is from about amino acids 36 to 37. In other specific embodiments, contemplated 
NOV29 epitopes are from about amino acids 38 to 39, 46 to 48, 49 to 62 and 88 to 91.

NOV30



One NOVX protein of the invention, referred to herein as NOV30, includes three
Thymosin Beta-4-like proteins. The disclosed proteins have been named NOV30a, NOV30b
and NOV30c.

NOV30a 

A disclosed NOV30a (designated CuraGen Acc. No. CG57330-01), which encodes a
novel Thymosin Beta-4-like protein and includes the 201 nucleotide sequence (SEQ ID
NO:87), is shown in Table 30A. An open reading frame for the mature protein was identified
beginning with an ATG initiation codon at nucleotides 49-51 and ending with a TAA stop
codon at nucleotides 199-201. Putative untranslated regions are underlined in Table 30A,
and the start and stop codons are in bold letters.

Table 30A. NOV30a Nucleotide Sequence (SEQ ID NO:87) 

AGTGGGCATTGCTCAGCTTCCTCTGTGACTACGTCTGACAAGTCCAATA TGGATGAGATCGAGAAATTCAGTAAGT 
CGAAACTGAAGAAGACAGAAATGCAAGAGAAAAATCCACAGCCTTCCAAGGAATGGATCGAACAGGAGAAGCAAGC 
AGGCTTCGTAATGAGGCGTGCATCACCAATATGCACTAAGGGCGAATAA
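The relationship between the NOV30a nucleotide sequence and the encoded polypeptide can be checked by translating the stated open reading frame with the standard genetic code. The sketch below is illustrative only: the function and variable names are hypothetical, and the sequence is transcribed from Table 30A on the assumption that the two characters printed as Q in the third line are to be read as G.

```python
from itertools import product

# Standard genetic code, bases ordered T, C, A, G at each codon position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# NOV30a nucleotide sequence (SEQ ID NO:87) as transcribed from Table 30A.
NOV30A = (
    "AGTGGGCATTGCTCAGCTTCCTCTGTGACTACGTCTGACAAGTCCAATATGGATGAGATCGAGAAATTCAGTAAGT"
    "CGAAACTGAAGAAGACAGAAATGCAAGAGAAAAATCCACAGCCTTCCAAGGAATGGATCGAACAGGAGAAGCAAGC"
    "AGGCTTCGTAATGAGGCGTGCATCACCAATATGCACTAAGGGCGAATAA"
)


def translate_orf(sequence: str, start: int, stop_end: int) -> str:
    """Translate an ORF given 1-based inclusive coordinates.

    start is the first base of the initiation codon and stop_end is the last
    base of the stop codon; the stop codon itself is not included in the result.
    """
    orf = sequence[start - 1:stop_end]
    if len(orf) % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    residues = [CODON_TABLE[orf[i:i + 3]] for i in range(0, len(orf), 3)]
    if residues[-1] != "*":
        raise ValueError("ORF does not end in a stop codon")
    return "".join(residues[:-1])


nov30a_protein = translate_orf(NOV30A, 49, 201)
print(len(NOV30A), len(nov30a_protein))
print(nov30a_protein)
```

Under these assumptions the translation is 50 residues long and matches the NOV30a polypeptide given in Table 30B.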

The disclosed NOV30a nucleic acid sequence maps to chromosome Xq21.3-22 and
has 161 of 192 bases (83%) identical to a gb:GENBANK-ID:HUMTHYB4|acc:M17733.1 
mRNA from Homo sapiens (Human thymosin beta-4 mRNA, complete cds) (E = 1 .9e"^^). 

A disclosed NOV30a polypeptide (SEQ ID NO:88) is 50 amino acid residues in 
length and is presented using the one-letter amino acid code in Table 30B. The SignalP, 
Psort and/or Hydropathy results predict that NOV30a does not have a signal peptide and is 
likely to be localized to the nucleus with a certainty of 0.5800. In alternative embodiments, a 
NOV30a polypeptide is located to the microbody (peroxisome) with a certainty of 0.3000, the 
mitochondrial matrix space with a certainty of 0.1000, or the lysosome (lumen) with a 
certainty of 0.1000.



Table 30B. Encoded NOV30a Protein Sequence (SEQ ID NO:88)

MDEIEKFSKSKLKKTEMQEKNPQPSKEWIEQEKQAGFVMRRASPICTKGE 



The NOV30a amino acid sequence was found to have 31 of 36 amino acid residues
(86%) identical to, and 31 of 36 amino acid residues (86%) similar to, the 50 amino acid
residue ptnr:SWISSPROT-ACC:P20065 protein from Mus musculus (Mouse) (THYMOSIN
BETA-4)(E = 1.9e*^*^. 

NOV30a is expressed in at least the following tissues: spleen, thymus, lung, and 
macrophage. Expression information was derived from the tissue sources of the sequences 
that were included in the derivation of the sequence of NOV30a.

Possible small nucleotide polymorphisms (SNPs) found for NOV30a are listed in 
Table 30C. 



Table 30C: SNPs

Consensus Position    Depth    Base Change    PAF
16                    19       G>T            0.105
32                    19       C>T            0.105
178                   19       G>A            0.105



NOV30b

A disclosed NOV30b (designated CuraGen Acc. No. CG57330-03), which encodes a
novel Beta Thymosin-like protein and includes the 246 nucleotide sequence (SEQ ID NO:89),
is shown in Table 30D. An open reading frame for the mature protein was identified
beginning with an ATG initiation codon at nucleotides 31-33 and ending with a TAG stop
codon at nucleotides 229-231. Putative untranslated regions are underlined in Table 30D, and
the start and stop codons are in bold letters.

Table 30D. NOV30b Nucleotide Sequence (SEQ ID NO:89) 

AGTGGGCATTGCTCAGCTTCCTCTGTGACTA TQTCTGjVCAAGTCCAATATGGATGAGA^ 

TCGAAACTGAAGAAGACAGAAATGCAAGAGAAAAATCCACAGCCTTCCAAGGTWVTGGATCGAACAG^ 

GCAGGCTTCGTAATGKAGGCGTGCATCX3CCAATATGCACTGTTCATTCCaCAAA6CATTTC 

TTTTAGCTGTTTAACri"rGAA 

The disclosed NOV30b nucleic acid sequence maps to chromosome 8 and has 216 of 
249 bases (86%) identical to a gb:GENBANK-ID:HUMTHYB4|acc:M17733.1 mRNA from
Homo sapiens (Human thymosin beta-4 mRNA, complete cds) (E = l.le"^). 

A disclosed NOV30b polypeptide (SEQ ID NO:90) is 66 amino acid residues in 
length and is presented using the one-letter amino acid code in Table 30E. The SignalP, Psort 
and/or Hydropathy results predict that NOV30b does not have a signal peptide and is likely to 
be localized to the microbody (peroxisome) with a certainty of 0.7095. In alternative 
embodiments, a NOV30b polypeptide is located to the mitochondrial matrix space with a 
certainty of 0.1000 or the lysosome (lumen) with a certainty of 0.1000.

Table 30E. Encoded NOV30b Protein Sequence (SEQ ID NO:90)

MSDKSNMDEIEKFSKSKLKKTEMQEKNPQPSKEWIEQEKQAGFVMRRASPICTVHSTKHCFLFYFF



The NOV30b amino acid sequence was found to have 36 of 42 amino acid residues
(85%) identical to, and 37 of 42 amino acid residues (88%) similar to, the 44 amino acid
residue ptnr:SPTREMBL-ACC:Q9NQQ5 protein from Homo sapiens (Human)
(DJ1071L10.1 (THYMOSIN/INTERFERON-INDUCIBLE MULTIGENE FAMILY)) (E = 
5.0e'^^). 

Expression information was derived from the tissue sources of the sequences that 
were included in the derivation of the sequence of NOV30b. The sequence is predicted to be 
expressed in the following tissues because of the expression pattern of (GENBANK-ID: 
gb:GENBANK-ID:HUMTHYB4|acc:M17733.1), a closely related Human thymosin beta-4 
mRNA, complete cds homolog in species Homo sapiens: Lung, small cell carcinoma. 

NOV30c

A disclosed NOV30c (designated CuraGen Acc. No. CG57330-02), which encodes a
novel Thymosin Beta-4-like protein and includes the 201 nucleotide sequence (SEQ ID
NO:91), is shown in Table 30F. An open reading frame for the mature protein was identified
beginning with an ATG initiation codon at nucleotides 31-33 and ending with a TAA stop
codon at nucleotides 199-201. Putative untranslated regions are underlined in Table 30F,
and the start and stop codons are in bold letters.
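The open reading frame coordinates quoted for the NOV clones can be cross-checked against the stated polypeptide lengths by simple arithmetic. The sketch below is illustrative only: the function name `orf_residue_count` is hypothetical, and it assumes 1-based inclusive coordinates in which the first value is the first base of the ATG codon and the second value is the last base of the stop codon.

```python
def orf_residue_count(start: int, stop_end: int) -> int:
    """Number of encoded residues for an ORF, excluding the stop codon.

    start: 1-based position of the first base of the initiation codon.
    stop_end: 1-based position of the last base of the stop codon.
    """
    span = stop_end - start + 1
    if span < 6 or span % 3 != 0:
        raise ValueError("coordinates do not describe a whole number of codons")
    # Every codon except the terminal stop codon contributes one residue.
    return span // 3 - 1


# ORF coordinates as stated in the text for each clone.
orfs = {
    "NOV29": (18, 317),
    "NOV30a": (49, 201),
    "NOV30b": (31, 231),
    "NOV30c": (31, 201),
}

lengths = {name: orf_residue_count(*coords) for name, coords in orfs.items()}
print(lengths)
```

Under these assumptions the computed lengths are 99, 50, 66 and 56 residues, in agreement with the polypeptide lengths stated for NOV29, NOV30a, NOV30b and NOV30c respectively.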

Table 30F. NOV30c Nucleotide Sequence (SEQ ID NO:91)

AGTGGGCATTQCTCAGCTTCCTCTGTGACTA TGTCI^ACAAGTCCaAT^^ 

TCGAAACTOAAGAAGACAGAAATGCAAGAGAAAAATCCACAGCCTTCC^AGGAATGGATCGAAC^ 
GCAGGCTTCGTAATGAGGCGTGCATCACCAATATGCACTAAGGGCGAATAA 

The disclosed NOV30c nucleic acid sequence maps to chromosome X and has 162 of
192 bases (84%) identical to a gb:GENBANK-ID:HUMTHYB4|acc:M17733.1 mRNA from
Homo sapiens (Human thymosin beta-4 mRNA, complete cds) (E = 7.5e"^'^). 

The NOV30c polypeptide (SEQ ID NO:92) is 56 amino acid residues in length and is
presented using the one-letter amino acid code in Table 30G. The SignalP, Psort and/or
Hydropathy results predict that NOV30c does not have a signal peptide and is likely to be
localized to the nucleus with a certainty of 0.5600. In alternative embodiments, a NOV30c
polypeptide is located to the microbody (peroxisome) with a certainty of 0.3000, the
mitochondrial matrix space with a certainty of 0.1000, or the lysosome (lumen) with a
certainty of 0.1000.



Table 30G. Encoded NOV30c Protein Sequence (SEQ ID NO:92) 

MSDKSNMDEIEKFSKSKLKKTEMQEKNPQPSKEWIEQEKQAGFVMRRASPICTKGE 



The NOV30c amino acid sequence was found to have 36 of 42 amino acid residues 
(85%) identical to, and 37 of 42 amino acid residues (88%) similar to, the 44 amino acid 
residue ptnr:SPTREMBL-ACC:Q9NQQ5 protein from Homo sapiens (Human) 
(DJ1071L10.1 (THYMOSIN/INTERFERON-INDUCIBLE MULTIGENE FAMILY)) (E =
4.5e-'^). 

NOV30c is expressed in at least the following tissues: adrenal gland, bone marrow, 
brain - amygdala, brain - cerebellum, brain - hippocampus, brain - substantia nigra, brain - 
thalamus, brain - whole, fetal brain, fetal kidney, fetal liver, fetal lung, heart, kidney,
lymphoma - Raji, mammary gland, pancreas, pituitary gland, placenta, prostate, salivary 
gland, skeletal muscle, small intestine, spinal cord, spleen, stomach, testis, thyroid, trachea 
and uterus. Expression information was derived from the tissue sources of the sequences that 
were included in the derivation of the sequence of NOV30c. 

Possible small nucleotide polymorphisms (SNPs) found for NOV30c are listed in 
Tables 30H and 30I.



Table 30H: SNPs

Consensus Position    Depth    Base Change    PAF
16                    47       G>T            0.043
32                    47       T>C            0.468
183                   23       G>A            0.087



Table 30I: SNPs

Variant     Nucleotide Position    Base Change    Amino Acid Position    Amino Acid Change
13377029    89                     A>G            14                     Lys>Arg
13377030    148                    C>T            148                    Gln>End
13377031    150                    A>G            150                    NA

Homologies to any of the above NOV30a, NOV30b and NOV30c proteins will be
shared by the other NOV30 proteins insofar as they are homologous to each other as shown
above. Any reference to NOV30 is assumed to refer to NOV30a, NOV30b and NOV30c
proteins in general, unless otherwise noted.

NOV30a, NOV30b and NOV30c are very closely homologous as is shown in the 
amino acid alignment in Table 30J.



Table 30J. ClustalW of NOV30a, NOV30b and NOV30c

               10        20        30        40        50
NOV30a ------MDEIEKFSKSKLKKTEMQEKNPQPSKEWIEQEKQAGFVMRRASP
NOV30b MSDKSNMDEIEKFSKSKLKKTEMQEKNPQPSKEWIEQEKQAGFVMRRASP
NOV30c MSDKSNMDEIEKFSKSKLKKTEMQEKNPQPSKEWIEQEKQAGFVMRRASP

               60
NOV30a ICTKGE----------
NOV30b ICTVHSTKHCFLFYFF
NOV30c ICTKGE----------



NOV30 also has homology to the amino acid sequences shown in the BLASTP data 
listed in Table 30K.



Table 30K. BLAST results for NOV30a

Gene Index/Identifier                      Protein/Organism                    Length (aa)  Identity (%)    Positives (%)   Expect
gi|17451239|ref|XP_070564.1| (XM_070564)   similar to ribosomal protein L10    158          37/37 (100%)    37/37 (100%)    1e-12
                                           (H. sapiens) [Homo sapiens]
gi|2143995|pir||I52084                     thymosin beta-4 precursor - rat     56           31/36 (86%)     31/36 (86%)     0.015
                                           (fragment)
gi|136580|sp|P20065|TYB4_MOUSE             Thymosin beta-4 (T beta 4)          50           31/36 (86%)     31/36 (86%)     0.089
gi|464974|sp|P34032|TYB4_RABIT             Thymosin beta-4 (T beta 4)          43           31/36 (86%)     31/36 (86%)     0.089
gi|10946578|ref|NP_067253.1| (NM_021278)   thymosin, beta 4, X chromosome;     44           31/36 (86%)     31/36 (86%)     0.089
                                           prothymosin beta 4 [Mus musculus]


The homology of these sequences is shown graphically in the ClustalW analysis 
shown in Table 30L. 

Table 30L. ClustalW Analysis of NOV30 

1) NOV30a (SEQ ID NO:88) 
2) NOV30b (SEQ ID NO:90) 
3) NOV30c (SEQ ID NO:92) 
4) gi|17451239| (SEQ ID NO:379) 
5) gi|2143995| (SEQ ID NO:380) 
6) gi|136580| (SEQ ID NO:381) 
7) gi|464974| (SEQ ID NO:382) 
8) gi|10946578| (SEQ ID NO:383) 



[ClustalW alignment of the eight sequences listed above; the residue-level alignment is not legible in the source text. Final residue numbers shown in the alignment: NOV30a, 50; NOV30b, 66; NOV30c, 56; gi|17451239|, 158; gi|2143995|, 56; gi|136580|, 50; gi|464974|, 43; gi|10946578|, 44.] 



Tables 30M and 30N list the domain description from DOMAIN analysis results 
against NOV30. This indicates that the NOV30 sequence has properties similar to those of 
other proteins known to contain these domains. 



Table 30M Domain Analysis of NOV30 

gnl|Smart|smart00152, THY, Thymosin beta actin-binding motif. 
CD-Length = 37 residues, 97.3% aligned 
Score = 32.0 bits (71), Expect = 0.009 

NOV30: 1 MDEIEKFSKSKLKKTEMQEKNPQPSKEWIEQEKQAG 36 (SEQ ID NO:384) 
          |||| |    ||||| |||  |||| ||||||
Sbjct: 1 TDEIENFDSENLKKTETIEKNVLPSKEDIEQEKQLQ 36 (SEQ ID NO:385) 
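
The identity markers between the NOV30 and Sbjct lines of Table 30M can be regenerated from the two aligned sequences. The following is a minimal sketch, assuming a gapless pairing of the 36 residues as printed; it marks only exact identities, whereas BLAST-style output may also flag conservative substitutions with "+". The function name is illustrative.

```python
# Aligned residues 1-36 as printed in Table 30M.
QUERY = "MDEIEKFSKSKLKKTEMQEKNPQPSKEWIEQEKQAG"   # NOV30
SBJCT = "TDEIENFDSENLKKTETIEKNVLPSKEDIEQEKQLQ"   # smart00152 consensus


def match_line(query, subject):
    """Return a line with '|' under identical residues and ' ' elsewhere."""
    if len(query) != len(subject):
        raise ValueError("sequences must be the same length (gapless pairing)")
    return "".join("|" if q == s else " " for q, s in zip(query, subject))


line = match_line(QUERY, SBJCT)
identities = line.count("|")

print(QUERY)
print(line)
print(SBJCT)
print(f"{identities} identical positions out of {len(QUERY)}")
```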

Table 30N Domain Analysis of NOV30 

hmmpfam - search a single seq against HMM database 
HMM file: pfamHMMs 

Scores for sequence family classification (score includes all 
domains): 

Model      Description              Score   E-value   N
Thymosin   Thymosin beta-4 family   57.1    3.7e-13   1
(INTERPRO) 

Parsed for domains: 

Model      Domain   seq-f  seq-t      hmm-f  hmm-t      score   E-value
Thymosin   1/1      1      36    [.   1      41    []   57.1    3.7e-13

Alignments of top-scoring domains: 

Thymosin: domain 1 of 1, from 1 to 36: score 57.1, E = 3.7e-13 

*->sDKPdleEiasFDKaKLKKtEtqEKnpLPtKEtiEqEKqae<-* (SEQ ID NO:386) 

NOV30a 1 MDEIEKFSKSKLKKTEMQEKNPQPSKEWIEQEKQAG 36 (SEQ ID NO:387) 

Thymosin beta-4 is a small polypeptide whose exact physiological role is not yet 
known. It was first isolated as a thymic hormone that induces terminal deoxynucleotidyl- 
transferase. It is found in high quantity in thymus and spleen but is widely distributed in 
many tissues. It has also been shown to bind to actin monomers and thus to inhibit actin 
polymerization. See InterPro IPR001152: 

A number of peptides closely related to thymosin beta-4 belong to this family. They 
include thymosin beta-9 (and beta-8) in bovine and pig, thymosin beta-10 in man and rat, 
thymosin beta-11 and beta-12 in trout and human Nb thymosin beta. 

Thymosin was originally isolated from a partially purified extract of calf thymus, 
thymosin fraction 5, which induced differentiation of T cells and was partially effective in 
some immunocompromised animals. Further studies demonstrated that the molecule is 
ubiquitous; it had been found in all tissues and cell lines analyzed. It is found in highest 
concentrations in spleen, thymus, lung, and peritoneal macrophages. 

Thymosin-beta-4 (T-beta-4) is an actin monomer sequestering protein that may have a 
critical role in modulating the dynamics of actin polymerization and depolymerization in 
nonmuscle cells. Its regulatory role is consistent with the many examples of transcriptional 
regulation of T-beta-4 and of tissue-specific expression. Lymphocytes have a unique T-beta-4 
transcript relative to the ubiquitous transcript found in many other tissues and cells. Rat 
thymosin-beta-4 is synthesized as a 44-amino acid propeptide which is processed into a 43- 
amino acid peptide by removal of the first methionyl residue. The molecule does not have a 
signal peptide. Human thymosin-beta-4 has a high degree of homology to rat thymosin-beta-4; 
the coding regions differ by only 9 nucleotides, and these are all silent base changes. 

A cDNA encoding thymosin-beta-4 has been isolated by differential screening of a 
cDNA library prepared from leukocytes of an acute lymphocytic leukemia patient. Using 
Northern blot analysis, the expression of the thymosin-beta-4 mRNA in various primary 
myeloid and lymphoid malignant cell lines and in hemopoietic cell lines was studied. The 
pattern of thymosin-beta-4 gene expression suggests that it may be involved in an early phase 
of the host defense mechanism. A cDNA clone for the human interferon-inducible gene 6-26 
has been isolated and shown to be identical to that for the human thymosin-beta-4 gene. By 
use of a panel of human-rodent somatic cell hybrids, it has been shown that the cDNA 
recognized 7 genes, members of a multigene family, present on chromosomes 1, 2, 4, 9, 11, 
20, and X. These genes are symbolized TMSL1, TMSL2, etc., respectively. 

In the mouse there is a single Tmsb4 gene and the lymphoid-specific transcript is 
generated by extending the ubiquitous exon 1 with an alternate downstream splice site. By 
interspecific backcross mapping, the mouse gene (designated Ptmb4) has been located to the 
distal region of the mouse X chromosome, linked to Btk and Gja6. Thus, the human gene 
could be predicted to reside on the X chromosome in the general region of Xq21.3-q22, 
where BTK is located. By analysis of somatic cell hybrids, the thymosin-beta-4, or TB4X, 
gene was mapped to the X chromosome. A homologous gene, TB4Y, is present on the Y 
chromosome. The TB4X gene escapes X inactivation, and it has been suggested that it should 
be investigated as a candidate gene for Turner syndrome. Thymosin-beta-4 induces the 
expression of terminal deoxynucleotidyl transferase activity in vivo and in vitro, inhibits the 
migration of macrophages, and stimulates the secretion of hypothalamic luteinizing hormone- 
releasing hormone. It has also been suggested that thymosin beta-4 is required for the 
metastasis of melanoma cells. 

The protein similarity information, expression pattern, cellular localization, and map 
location for the NOV30 protein and nucleic acid disclosed herein suggest that this thymosin 
beta-4-like protein may have important structural and/or physiological functions 
characteristic of the thymosin beta-4 family. Therefore, the nucleic acids and proteins of the 
invention are useful in potential diagnostic and therapeutic applications and as a research 
tool. These include serving as a specific or selective nucleic acid or protein diagnostic and/or 
prognostic marker, wherein the presence or amount of the nucleic acid or the protein is to be 
assessed. These also include potential therapeutic applications such as the following: (i) a 
protein therapeutic, (ii) a small molecule drug target, (iii) an antibody target (therapeutic, 
diagnostic, drug targeting/cytotoxic antibody), (iv) a nucleic acid useful in gene therapy (gene 
delivery/gene ablation), (v) an agent promoting tissue regeneration in vitro and in vivo, and 
(vi) a biological defense weapon. 

The NOV30 nucleic acids and proteins of the invention have applications in the 
diagnosis and/or treatment of various diseases and disorders. For example, the compositions 
of the present invention may have efficacy for the treatment of patients suffering from 
agammaglobulinemia, type 1, X-linked; agammaglobulinemia, X-linked; XLA and isolated 
growth hormone deficiency; premature ovarian failure; idiopathic thrombocytopenic purpura, 
immunodeficiencies, graft versus host disease; systemic lupus erythematosus, autoimmune 
disease, asthma, emphysema, scleroderma, ARDS; allergies, cancer, compromised immune 
system as well as other diseases, disorders and conditions. 

These antibodies may be generated according to methods known in the art, using 
prediction from hydrophobicity charts, as described in the "Anti-NOVX Antibodies" section 
below. The disclosed NOV30 protein has multiple hydrophilic regions, each of which can be 
used as an immunogen. In one embodiment, a contemplated NOV30 epitope is from about 
amino acids 11 to 13. In another embodiment, a contemplated NOV30 epitope is from about 
amino acids 14 to 16. In other specific embodiments, contemplated NOV30 epitopes are 
from about amino acids 17 to 19, 21 to 25, 26 to 27, 31 to 32, 35 to 36 and 37 to 41. 

NOV31 

One NOVX protein of the invention, referred to herein as NOV31, includes two 
Myelin P2-like nucleic acids encoding the same protein. The disclosed nucleic acids have 
been named NOV31a and NOV31b. 

NOV31a 

A disclosed NOV31a (designated CuraGen Acc. No. CG57344-01), which encodes a 
novel Myelin P2-like protein and includes the 457 nucleotide sequence (SEQ ID NO:93), is 
shown in Table 31A. An open reading frame for the mature protein was identified beginning 
with an ATG initiation codon at nucleotides 21-23 and ending with a TAA stop codon at 
nucleotides 441-443. Putative untranslated regions are underlined in Table 31A, and the start 
and stop codons are in bold letters. 

Table 31A. NOV31a Nucleotide Sequence (SEQ ID NO:93) 

ATCAACTTATCTCAGACAGAATGATTGACCAGCTCCAAGGAACATGGAAGTCCATTTCTTGTGAAAATTCCGAAGACT 
ACATGAAGGAGCTGGGTATAGGAAGAGCCAGCAGGAAACTGGGCCGTTTGGCAAAACCCACTGTGACCATCAGTACAG 
ATGGAGATGTCATCACAATAAAAACCAAAAGCATCTTTAAAAATAATGAGATCTCCTTTAAGCTGGGAGAAGAGTTTG 
AGGAAATCACGCCAGGTGGCCACAAAACAAAGAGTAAAGTAACCTTAGATAAGGAGTCCCTGATTCAAGTTCAGGACT 
GGGATGGCAAAGAAACCACCATAACGAGAAAGCTGGTGGATGGGAAAATGGTGGTGGAAAGTACT 
TCTGTAmCGgACATACGAGAAAGTATCa^TCaAACTCAGTCTCAAAC^ 

The disclosed NOV31a nucleic acid sequence maps to chromosome 8 and has 298 of 
418 bases (71%) identical to a gb:GENBANK-ID:RABPLP2|acc:J03744.1 mRNA from 
Oryctolagus cuniculus (Rabbit myelin P2 mRNA, complete cds) (E = 3.9e'^*). 

NOV31b 

A disclosed NOV31b (designated CuraGen Acc. No. CG57344-02) also encodes a 
novel Myelin P2-like protein. This nucleic acid includes a 426 nucleotide sequence which 
differs from NOV31a by having a 20 nucleotide deletion at the 5' end (the 5' UTR), an 11 
nucleotide deletion at the 3' end and one mutation (T>C) at position 251 (numbered relative 
to NOV31a). An open reading frame for the mature protein was identified beginning with an 
ATG initiation codon at nucleotides 1-3 and ending with a TAA stop codon at nucleotides 
421-423. Putative untranslated regions are underlined in Table 31b, and the start and stop 
codons are in bold letters. 
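
The coordinates given for NOV31b follow from those of NOV31a by simple arithmetic. The sketch below is illustrative only: the helper name and variable names are assumptions, and the only inputs are the figures stated above (a 457 nucleotide NOV31a, a 20 nucleotide 5' deletion, an 11 nucleotide 3' deletion, and an open reading frame at nucleotides 21-443 of NOV31a).

```python
NOV31A_LENGTH = 457          # nucleotides, from the NOV31a description
FIVE_PRIME_DELETION = 20     # removed from the 5' end in NOV31b
THREE_PRIME_DELETION = 11    # removed from the 3' end in NOV31b
NOV31A_ORF = (21, 443)       # first base of ATG, last base of TAA (1-based)


def shift_position(pos, five_prime_deletion):
    """Map a 1-based NOV31a position onto NOV31b numbering."""
    shifted = pos - five_prime_deletion
    if shifted < 1:
        raise ValueError("position lies inside the deleted 5' region")
    return shifted


nov31b_length = NOV31A_LENGTH - FIVE_PRIME_DELETION - THREE_PRIME_DELETION
nov31b_orf = tuple(shift_position(p, FIVE_PRIME_DELETION) for p in NOV31A_ORF)

# Codons in the open reading frame, including the stop codon.
orf_codons = (nov31b_orf[1] - nov31b_orf[0] + 1) // 3
encoded_residues = orf_codons - 1   # the stop codon encodes no residue

print(nov31b_length, nov31b_orf, encoded_residues)
```

The same shift places the T>C mutation, numbered 251 relative to NOV31a, at position 231 in NOV31b numbering.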

The disclosed NOV31b nucleic acid sequence maps to chromosome 8 and has 291 of 
403 bases (72%) identical to a gb:GENBANK-ID:RABPLP2|acc:J03744.1 mRNA from 
Oryctolagus cuniculus (Rabbit myelin P2 mRNA, complete cds) (E = 5.8e'^*). 

The NOV31 polypeptide (SEQ ID NO:94) is 140 amino acid residues in length and is 
presented using the one-letter amino acid code in Table 31B. The SignalP, Psort and/or 
Hydropathy results predict that NOV31a does not have a signal peptide and is likely to be 
localized to the cytoplasm with a certainty of 0.6500. In alternative embodiments, a NOV31a 
polypeptide is located to the mitochondrial matrix space with a certainty of 0.1000 or the 
lysosome (lumen) with a certainty of 0.1000. 



Table 31B. Encoded NOV31 Protein Sequence (SEQ ID NO:94) 

MIDQLQGTWKSISCENSEDYMKELGIGRASRKLGRLAKPTVTISTD 

PGGHKTKSKVTLDKESLIQVQDWDGKETTITRKLVDGKMVVESTV 

The NOV31 amino acid sequence was found to have 86 of 132 amino acid residues 
(65%) identical to, and 102 of 132 amino acid residues (77%) similar to, the 132 amino acid 
residue ptnr:pir-id:MPRB2 protein from rabbit (myelin P2 protein) (E = L7e"^^). 
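
The percentages quoted with the identity counts can be reproduced from the counts themselves. The sketch below assumes integer truncation rather than rounding; that is an inference from the figures in this document (for example, 321 of 413 is reported as 77%), not a stated method, and the function name is illustrative.

```python
def percent_truncated(matches, aligned_length):
    """Percentage of matching positions, truncated to a whole number."""
    if aligned_length <= 0:
        raise ValueError("aligned length must be positive")
    return matches * 100 // aligned_length


# (matches, aligned length, percentage as reported in the text)
REPORTED = [
    (298, 418, 71),   # NOV31a nucleic acid vs. rabbit myelin P2 mRNA
    (291, 403, 72),   # NOV31b nucleic acid vs. rabbit myelin P2 mRNA
    (86, 132, 65),    # NOV31 protein, identical residues
    (102, 132, 77),   # NOV31 protein, similar residues
    (321, 413, 77),   # NOV32a nucleic acid; rounding would give 78
]

for matches, length, reported in REPORTED:
    print(matches, length, percent_truncated(matches, length), reported)
```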

NOV31 is predicted to be expressed in at least the following tissues because of the 
expression pattern of a closely related homolog, the Oryctolagus cuniculus (rabbit) myelin P2 
mRNA, complete cds (gb:GENBANK-ID:RABPLP2|acc:J03744.1): sciatic nerve, 
spinal cord, and brain. 

Possible single nucleotide polymorphisms (SNPs) found for NOV31 are listed in 
Table 31C. 



Table 31C: SNPs 

Consensus Position    Depth    Base Change    PAF
196                   21       A>G            0.095



Homologies to any of the above NOV31 proteins will be shared by the other NOV31 
proteins insofar as they are homologous to each other as shown above. Any reference to 
NOV31 is assumed to refer to NOV31a and NOV31b proteins in general, unless otherwise 
noted. 

NOV31 also has homology to the amino acid sequences shown in the BLASTP data 
listed in Table 31D. 



Table 31D. BLAST results for NOV31a 

Gene Index/Identifier           Protein/Organism                  Length  Identity       Positives      Expect
                                                                  (aa)    (%)            (%)

gi|12838509|dbj|BAB24227.1|     data source:SPTR,                 132     106/132 (80%)  119/132 (89%)  3e-52
(AK005765)                      source key:P24526,
                                evidence:ISS~putative~similar to
                                MYELIN P2 PROTEIN [Mus musculus]

gi|127727|sp|P02691|MYP2_RABIT  Myelin P2 protein                 132     86/132 (65%)   102/132 (77%)  1e-38

gi|4505909|ref|NP_002668.1|     peripheral myelin protein 2;      132     87/132 (65%)   101/132 (75%)  3e-38
(NM_002677)                     M-FABP [Homo sapiens]

gi|127726|sp|P24526|MYP2_MOUSE  Myelin P2 protein                 132     82/132 (62%)   99/132 (74%)   6e-38

gi|1353194|sp|P48035|FABA_BOVIN Fatty acid-binding protein,       132     78/131 (59%)   100/131 (75%)  2e-37
                                adipocyte (AFABP) (Adipocyte
                                lipid-binding protein) (ALBP)



The homology of these sequences is shown graphically in the ClustalW analysis 
shown in Table 31E. 

Table 31E. ClustalW Analysis of NOV31 

1) NOV31a (SEQ ID NO:94) 
2) NOV31b (SEQ ID NO:96) 
3) gi|12838509| (SEQ ID NO:388) 
4) gi|127727| (SEQ ID NO:389) 
5) gi|4505909| (SEQ ID NO:390) 
6) gi|127726| (SEQ ID NO:391) 
7) gi|1353194| (SEQ ID NO:392) 

[ClustalW alignment of the seven sequences listed above; the residue-level alignment is not legible in the source text.] 




Table 31F lists the domain description from DOMAIN analysis results against 
NOV31. This indicates that the NOV31 sequence has properties similar to those of other 
proteins known to contain these domains. 

Table 31F Domain Analysis of NOV31 

gnl|Pfam|pfam00061, lipocalin, Lipocalin / cytosolic fatty-acid binding 
protein family. Lipocalins are transporters for small hydrophobic 
molecules, such as lipids, steroid hormones, bilins, and retinoids. 
Alignment subsumes both the lipocalin and fatty acid binding protein 
signatures from PROSITE. This is supported on structural and functional 
grounds. Structure is an eight-stranded beta barrel. 

CD-Length = 145 residues, 100.0% aligned 

Score = 56.6 bits (135), Expect = 9e-10 

NOV31: 4 QLQGTWKSISCENSEDYMK-ELGIGRASRKLGIOiAK-PTVTISTIXSDVITIKTKSIFK^ 61 

+ I I I + +1 111+ l+ll I +1111 I + I 

SbjCt: 1 KFAC3KWYLVASANFDPEIJCEELGVLEATRKEITPLKE61ILEIVFTCDKNGI-CEETTO 59 

NOV31: 62 EISFKLGEEFEEITPGGHKTKSKVTIJDKESI-IQVQDWDGKETTITRKLVIXSKMWESTV- 120 

I + III 11+ I 11+11 II 11+ I +1 

Sbjct: 60 EKTKKIiGVEPDYYT<a5NRFVVLiyrDYDNYIiVCVQKGDGNETSRT^^ 119 

NOV31: 121 NSVICTRTYEKV 132 (SEQ ID NO: 393) 

Sbjct: 120 EliFETATKEIiGIPEDNWCTRQTERC 145 (SEQ ID NO; 394) 

See InterPro IPR000463: Cytosolic fatty-acid binding protein. The Fatty Acid- 
Binding Proteins (FABPs) are a family of proteins that are principally located in the cytosol 
and are characterized by the ability to bind to hydrophobic ligands, such as fatty acids, 
retinol, retinoic acid, bile salts and pigments. Recently, a number of family members have 
been identified that are secreted, such as gastrotropin and mammary-derived growth inhibitor. 
The family is implicated in general lipid metabolism, acting as intracellular transporters of 
hydrophobic metabolic intermediates and as carriers of lipids between membranes. The 
FABPs exhibit a high degree both of sequence and structural similarity. They are small, 12-18 
kDa, soluble proteins composed of 110-160 residues. Their crystal structures show them 
to be 10-stranded anti-parallel beta-barrels with a +1,+1 topology, which wrap around an 
internal cavity to form a ligand binding site. The anti-parallel beta-barrel fold is also 
exploited by the lipocalins, which function similarly by binding small hydrophobic 
molecules. Similarity at the sequence level, however, is less obvious, being confined to a 
single short N-terminal motif. Proteins which transport small hydrophobic molecules such as 
steroids, bilins, retinoids, and lipids share limited regions of sequence homology and a 
common tertiary structure architecture. This is an eight stranded antiparallel beta-barrel with 
a repeated +1 topology enclosing an internal ligand binding site. The name 'lipocalin' has been 
proposed for this protein family, but cytosolic fatty-acid binding proteins are also included. 
The sequences of most members of the family, the core or kernel lipocalins, are characterized 
by three short conserved stretches of residues, while others, the outlier lipocalin group, share 
only one or two of these. 

Myelin is a multilamellar compacted membrane structure that surrounds and insulates 
axons, facilitating the conduction of nerve impulses. It is composed predominantly of lipids, 
with proteins accounting for about 30% of its net weight. Schwann cells are responsible for 
myelin formation in the peripheral nervous system. Peripheral myelin protein-2 (PMP2), a 
small basic protein, is one of the major proteins of peripheral myelin and appears to be 
related to the transport of fatty acids or the metabolism of myelin lipids. Hayasaka et al. 
(1991) noted that PMP2 (which they also called myelin P2 protein, MP2) was shown to have 
lipid-binding activity. Thus, MP2 protein may have an important role in the organization of 
compact myelin. 

Hayasaka et al. (1991) isolated a full-length cDNA of MP2 protein of peripheral 
myelin from a cDNA library of human fetus spinal cord. It was found to contain a 393-bp 
open reading frame encoding a polypeptide of 131 residues. The deduced amino acid 
sequence is highly homologous to myelin P2 protein from other species. Hayasaka et al. 
(1993) cloned the genomic PMP2 sequence, which is about 8 kb long and consists of 4 exons. 
By spot-blot hybridization of flow-sorted human chromosomes and fluorescence in 
situ hybridization (FISH), Hayasaka et al. (1993) mapped the PMP2 gene to chromosome 
8q21.3-q22.1. This is the same region as that in which the autosomal recessive form of 
Charcot-Marie-Tooth peroneal muscular atrophy (CMT4A) has been mapped. Thus, the PMP2 gene 
was a prime candidate for the site of the mutation in that disorder. Narayanan et al. (1994) 
reported the partial structure of the PMP2 gene. Using a panel of human/hamster somatic cell 
hybrids and by FISH, they localized the gene to 8q21. Ben Othmane et al. (1995) created a 
7-Mb YAC contig spanning the region of 8q13-q21 to which the CMT4A gene was mapped. 
This contig was used to map 9 additional microsatellites and 6 STSs to this region; 
subsequent haplotype analysis narrowed the CMT4A flanking interval to less than 1 cM. 
Using SSCP and the physical map, they could demonstrate that the PMP2 gene is not the 
defect in CMT4A. 

Myelin P2 is a 14,800-Da cytosolic protein found in rabbit sciatic nerves. It belongs to 
a family of fatty acid binding proteins and shows a 72% amino acid sequence similarity to 
aP2/422, the adipocyte lipid binding protein, a 58% sequence similarity to rat heart fatty acid 
binding protein, and a 40% sequence similarity to cellular retinoic acid binding protein. In 
order to isolate cDNA clones representing P2, a cDNA library was constructed from 
poly(A+) RNA isolated from sciatic nerves of 10-day-old rabbit pups. By use of a mixed 
synthetic oligonucleotide probe based on the rabbit P2 amino sequence, 12 cDNA clones 
were selected from about 25,000 recombinants. Four of these were further characterized. 
They contained an open reading frame which, when translated, agreed at 128 out of 131 
residues with the known rabbit P2 amino acid sequence. These cDNAs recognize a 
1.9-kilobase mRNA present in sciatic nerve, spinal cord, and brain, but not present in liver or 
heart. The levels of P2 mRNA parallel myelin formation in sciatic nerve and spinal cord with 
maximal amounts being detected at about 15 postnatal days. P2 protein is a small basic 
protein (Mr = 14,820) found in peripheral nerve myelin and spinal cord myelin. There is now 
overwhelming evidence that P2 protein is the crucial antigen involved in the induction of 
experimental allergic neuritis, an autoimmune disease of the peripheral nervous system. The 
complete amino acid sequence of rabbit P2 protein was derived by sequence analysis of 
cyanogen bromide peptides and peptides obtained by proteolysis using Staphylococcus 
aureus V8 enzyme, trypsin, or clostripain. There are 131 amino acids and an excess of the 
basic amino acids lysine and arginine; histidine is absent. There are 3 highly hydrophobic 
regions in the P2 molecule. Probability analysis of the sequence predicts a high degree of beta 
structure, essentially in agreement with CD data. 

The protein similarity information, expression pattern, cellular localization, and map 
location for the NOV31 protein and nucleic acid disclosed herein suggest that this Myelin P2- 
like protein may have important structural and/or physiological functions characteristic of the 
Fatty Acid Binding Protein family. Therefore, the nucleic acids and proteins of the invention 
are useful in potential diagnostic and therapeutic applications and as a research tool. These 
include serving as a specific or selective nucleic acid or protein diagnostic and/or prognostic 
marker, wherein the presence or amount of the nucleic acid or the protein is to be assessed. 
These also include potential therapeutic applications such as the following: (i) a protein 
therapeutic, (ii) a small molecule drug target, (iii) an antibody target (therapeutic, diagnostic, 
drug targeting/cytotoxic antibody), (iv) a nucleic acid useful in gene therapy (gene 
delivery/gene ablation), (v) an agent promoting tissue regeneration in vitro and in vivo, and 
(vi) a biological defense weapon. 

The NOV31 nucleic acids and proteins of the invention have applications in the 
diagnosis and/or treatment of various diseases and disorders. For example, the compositions 
of the present invention will have efficacy for the treatment of patients suffering from: 
Charcot-Marie-Tooth peroneal muscular atrophy, allergic neuritis (an autoimmune disease of 
the peripheral nervous system), Von Hippel-Lindau (VHL) syndrome, Alzheimer's disease, 
stroke, tuberous sclerosis, hypercalcemia, Parkinson's disease, Huntington's disease, cerebral 
palsy, epilepsy, Lesch-Nyhan syndrome, multiple sclerosis, ataxia-telangiectasia, 
leukodystrophies, behavioral disorders, addiction, anxiety, pain, neuroprotection as well as 
other diseases, disorders and conditions. 

These antibodies may be generated according to methods known in the art, using 
prediction from hydrophobicity charts, as described in the "Anti-NOVX Antibodies" section 
below. The disclosed NOV31 protein has multiple hydrophilic regions, each of which can be 
used as an immunogen. In one embodiment, a contemplated NOV31 epitope is from about 
amino acids 10 to 12. In another embodiment, a contemplated NOV31 epitope is from about 
amino acids 20 to 21. In other specific embodiments, contemplated NOV31 epitopes are 
from about amino acids 22 to 25, 30 to 31, 38 to 42, 50 to 51, 58 to 60, 65 to 67, 70 to 73, 75 
to 78, 81 to 83, 84 to 85, 86 to 87, 90 to 100, 105 to 110, 110 to 112, 121 to 123 and 130 to 133. 

NOV32 

One NOVX protein of the invention, referred to herein as NOV32, includes two 
Testis Lipid-Binding Protein-like proteins. The disclosed proteins have been named NOV32a 
and NOV32b. 

NOV32a 

A disclosed NOV32a (designated CuraGen Acc. No. CG57346-01), which encodes a 
novel Testis Lipid-Binding Protein-like protein and includes the 408 nucleotide sequence 
(SEQ ID NO:95) is shown in Table 32A. An open reading frame for the mature protein was 
identified beginning with an ATG initiation codon at nucleotides 10-12 and ending with a 
TGA stop codon at nucleotides 400-402. Putative untranslated regions are underlined in 
Table 32A, and the start and stop codons are in bold letters. 

Table 32A. NOV32a Nucleotide Sequence (SEQ ID NO:95) 

TGTTCCATGA TGGTTGAGCCCTTCTTGGGAACCTGGAAGCTGGTCTCCAGTGflJ^ 

AACTGGGTTTCGCAGCCCGGAACaTGGCaGGGTTAGTGAi^CCGACAGTAACTATTAGTGT^ 

GACCATAAGAACAGAAAGTTCTTTCCAGGACACTAAGATCTCCTTCAAGCTGGGGGAAGAAT^ 

GCaGACAACaSGAAACTAAAGAGOVCCaTAACATTAGAGAATGGCTCA^ 

AAGAGACAACAATCAAAAGAAAAATTGTGGATGAAAAAATGGTAGTGGAATGTAAAATGAAT 

CAGAATCTACGAAAAGGTGTGAAGAAAG 

The disclosed NOV32a nucleic acid sequence maps to chromosome 8 and has 321 of 
413 bases (77%) identical to a gb:GENBANK-ID:RRU07870|acc:U07870.1 mRNA from 
Rattus norvegicus (Rattus norvegicus testis lipid binding protein mRNA, complete cds) (E = 

A disclosed NOV32a polypeptide (SEQ ID NO:96) is 130 amino acid residues in 
length and is presented using the one-letter amino acid code in Table 32B. The SignalP, 
Psort and/or Hydropathy results predict that NOV32a does not have a signal peptide and is 
likely to be localized to the cytoplasm with a certainty of 0.4500. In alternative 
embodiments, a NOV32a polypeptide is located to the mitochondrial matrix space with a 
certainty of 0.1000, the lysosome (lumen) with a certainty of 0.1000 or the microbody 
(peroxisome) with a certainty of 0.1000. 

Table 32B. Encoded NOV32a Protein Sequence (SEQ ID NO:96) 

MVEPFLGTWKLVSSENFEDYMKELGFAARNMAGLVKPTVTISVDGKMMTIRTESSFQDTKISFKLGEEFDETTAD 

NRKVKSTITLENGSMIHVQKWLGKETTIKRKIVDEKMVVECKMNNIVSTRIYEKV 

The NOV32a amino acid sequence was found to have 90 of 132 amino acid residues 
(68%) identical to, and 112 of 132 amino acid residues (84%) similar to, the 132 amino acid 
residue ptnr:SWISSPROT-ACC:O08716 protein from Mus musculus (Mouse) (TESTIS 
LIPID BINDING PROTEIN (TLBP) (15 KDA PERFORATORIAL PROTEIN) (PERF 15)) 
(E = 3.1e"^). 

NOV32a is predicted to be expressed in testis because of the expression pattern of 
(GENBANK-ID: gb:GENBANK-ID:RRU07870|acc:U07870.1), a closely related Rattus 
norvegicus testis lipid binding protein mRNA, complete cds homolog in species Rattus 
norvegicus. 

NOV32b 

A disclosed NOV32b (designated CuraGen Acc. No. CG57346-02), which encodes a 
novel Testis Lipid Binding Protein-like protein and includes the 459 nucleotide sequence 
(SEQ ID NO:97) is shown in Table 32C. An open reading frame for the mature protein was 
identified beginning with an ATG initiation codon at nucleotides 28-30 and ending with a 
TGA stop codon at nucleotides 427-429. Putative untranslated regions are underlined in 
Table 32C, and the start and stop codons are in bold letters. 

Table 32C. NOV32b Nucleotide Sequence (SEQ ID NO:97) 

CGAGTGGCTCTTCTCAGCAAGTGTTCCATGATGGTTGAGCCCTTCTTGGGAACCTGGAAGCTGGTCTCCAGTGAA 
AACTTTGAGGATTACATGAAAGAACTGGGTGTGAATTTCGCAGCCCGGAACATGGCAGGGTTAGTGAAACCGACA 
GTAACTATTAGTGTTGATGGGAAAATGATGACCATAAGAACAGAAAGTTCTTTCCAGGACACTAAGATCTCCTTC 
AAGCTGGGGGAAGAATTTGATGAAACTACAGCAGACAACCGGAAAGTAAAGAGCACCATAACATTAGAGAATGGC 
TCAATGATTCACGTCCAAAAATGGCTTGGCAAAGAGACAACAATCAAAAGAAAAATTGTGGATGAAAAAATGGTA 
GTGGAATGTAAAATGAATAATATTGTCAGCACCAGAATCTACGAAAAGGTGTGAAGAAAGGTCCACAGCAATGAA 
AACTTGTTC 

The disclosed NOV32b nucleic acid sequence maps to chromosome 8 and has 347 of 
446 bases (77%) identical to a gb:GENBANK-ID:RRU07870|acc:U07870.1 mRNA from 
Rattus norvegicus (Rattus norvegicus testis lipid binding protein mRNA, complete cds) (E = 
3.5e"^^). 

The NOV32b polypeptide (SEQ ID NO:98) is 133 amino acid residues in length and 
is presented using the one-letter amino acid code in Table 32D. The SignalP, Psort and/or 
Hydropathy results predict that NOV32b does not have a signal peptide and is likely to be 
localized to the cytoplasm with a certainty of 0.6500. In alternative embodiments, a NOV32b 
polypeptide is located to the mitochondrial matrix space with a certainty of 0.1000, the 
lysosome (lumen) with a certainty of 0.1000 or the microbody (peroxisome) with a certainty 
of 0.0138. 
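
The relationship between the NOV32b open reading frame coordinates and the polypeptide length can be illustrated with a short translation sketch. It uses the standard genetic code and, as input, only the first 75 nucleotides of the NOV32b sequence as read from Table 32C (with the spacing and stray characters of the printed line removed, which is an assumption about the original text). The function names are illustrative.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# First 75 nucleotides of NOV32b (Table 32C); the ATG is at nucleotides 28-30.
NOV32B_FIRST_LINE = (
    "CGAGTGGCTCTTCTCAGCAAGTGTTCCATGATGGTTGAGCCCTTCTTGGGAACCTGGAAGCTGGTCTCCAGTGAA"
)


def translate_orf(seq, start):
    """Translate from a 1-based start position to a stop codon or sequence end."""
    seq = seq.upper()
    residues = []
    for i in range(start - 1, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":          # stop codon: end of the open reading frame
            break
        residues.append(aa)
    return "".join(residues)


def orf_residue_count(start, stop_end):
    """Residues encoded by an ORF given its first base and the last base of the stop."""
    codons, remainder = divmod(stop_end - start + 1, 3)
    if remainder:
        raise ValueError("ORF length is not a multiple of three")
    return codons - 1          # the stop codon encodes no residue


print(translate_orf(NOV32B_FIRST_LINE, 28))
print(orf_residue_count(28, 429))
```

The second call reproduces the stated length of 133 residues from the stated coordinates (nucleotides 28-30 through 427-429); the same calculation for NOV32a (nucleotides 10-12 through 400-402) gives 130.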



Table 32D. Encoded NOV32b Protein Sequence (SEQ ID NO:98) 

MMVEPFLGTWKLVSSENFEDYMKELGVNFAARNMAGLVKPTVTISVDGKMMTIRTESSFQDTKISFKLGEEFD 

ETTADNRKVKSTITLENGSMIHVQKWLGKETTIKRKIVDEKMVVECKMNNIVSTRIYEKV 

The NOV32b amino acid sequence was found to have 91 of 132 amino acid residues 
(68%) identical to, and 113 of 132 amino acid residues (85%) similar to, the 132 amino acid 
residue ptnr:SWISSPROT-ACC:O08716 protein from Mus musculus (Mouse) (TESTIS 
LIPID BINDING PROTEIN (TLBP) (15 KDA PERFORATORIAL PROTEIN) (PERF 15)) 
(E=L5e-^^). 

NOV32b is predicted to be expressed in at least the testis. Expression information was 
derived from the tissue sources of the sequences that were included in the derivation of the 
sequence of NOV32b. The sequence is also predicted to be expressed in the testis because of 
the expression pattern of (GENBANK-ID: gb:GENBANK-ID:RRU07870|acc:U07870.1), a 
closely related Rattus norvegicus testis lipid binding protein mRNA, complete cds homolog 
in Rattus norvegicus. 

Homologies to any of the above NOV32a and NOV32b proteins will be shared by the 
other NOV32 proteins insofar as they are homologous to each other as shown above. Any 
reference to NOV32 is assumed to refer to NOV32a and NOV32b proteins in general, unless 
otherwise noted. 

NOV32a and NOV32b are very closely homologous as is shown in the amino acid 
alignment in Table 32E. 

Table 32E. ClustalW of NOV32a and NOV32b 

                 10        20        30        40        50
NOV32a   -MVEPFLGTWKLVSSENFEDYMKELG--FAARNMAGLVKPTVTISVDGKM   47
NOV32b   MMVEPFLGTWKLVSSENFEDYMKELGVNFAARNMAGLVKPTVTISVDGKM   50

                 60        70        80        90       100
NOV32a   MTIRTESSFQDTKISFKLGEEFDETTADNRKVKSTITLENGSMIHVQKWL   97
NOV32b   MTIRTESSFQDTKISFKLGEEFDETTADNRKVKSTITLENGSMIHVQKWL  100

                110       120       130
NOV32a   GKETTIKRKIVDEKMVVECKMNNIVSTRIYEKV  130
NOV32b   GKETTIKRKIVDEKMVVECKMNNIVSTRIYEKV  133





NOV32a also has homology to the amino acid sequences shown in the BLASTP data 
listed in Table 32F. 



Table 32F. BLAST results for NOV32a 

Gene Index/Identifier           Protein/Organism                  Length  Identity       Positives      Expect
                                                                  (aa)    (%)            (%)

gi|17449600|ref|XP_070467.1|    similar to RIKEN cDNA             132     130/132 (98%)  130/132 (98%)  1e-58
(XM_070467)                     1700007P10 gene (H. sapiens)
                                [Homo sapiens]

gi|13386216|ref|NP_081557.1|    RIKEN cDNA 1700007P10             132     93/132 (70%)   113/132 (85%)  2e-44
(NM_027281)                     [Mus musculus]

gi|6755801|ref|NP_035728.1|     testis lipid binding protein      132     90/132 (68%)   112/132 (84%)  7e-44
(NM_011598)                     [Mus musculus]

gi|12408304|ref|NP_074045.1|    testis lipid binding protein      132     89/132 (67%)   112/132 (84%)  2e-43
(NM_022854)                     [Rattus norvegicus]

gi|14423683|sp|O97788|FABA_PIG  Fatty acid-binding protein,       132     84/131 (64%)   111/131 (84%)  3e-41
                                adipocyte (AFABP) (Adipocyte
                                lipid-binding protein) (ALBP)
                                (A-FABP) (AP2)


The homology of these sequences is shown graphically in the ClustalW analysis 
shown in Table 32G. 

Table 32G. ClustalW Analysis of NOV32 

1) NOV32a (SEQ ID NO:96) 
2) NOV32b (SEQ ID NO:98) 
3) gi|17449600 (SEQ ID NO:395) 
4) gi|13386216 (SEQ ID NO:396) 
5) gi|6755801 (SEQ ID NO:397) 
6) gi|12408304 (SEQ ID NO:398) 
7) gi|14423683 (SEQ ID NO:399) 

[ClustalW alignment of sequences 1 to 7; the aligned residue rows are not legible in this copy.] 

Table 32H lists the domain description from DOMAIN analysis results against 
NOV32. This indicates that the NOV32 sequence has properties similar to those of other 
proteins known to contain these domains. 

Table 32H. Domain Analysis of NOV32 

gnl|Pfam|pfam00061, lipocalin, Lipocalin / cytosolic fatty-acid binding 
protein family. Lipocalins are transporters for small hydrophobic 
molecules, such as lipids, steroid hormones, bilins, and retinoids. 
Alignment subsumes both the lipocalin and fatty acid binding protein 
signatures from PROSITE. This is supported on structural and functional 
grounds. Structure is an eight-stranded beta barrel. 

CD-Length = 145 residues, 87.6% aligned 

Score = 57.8 bits (138), Expect = 4e-10 

NOV32:5 FLGTWKLVSSENFEDYMKE LGFAARNMAGLVK-PTVTISVDGKMMTIRTESSFQDTK 60 
Sbjct:2 FAGKWYLVASANFDPELKEELGVLEATRKEITPLKECaJnJEIVFDGDKNGICEETFGK^ 61 

NOV32:61 ISFKLGEEFDETTADNRKVKSTITLENGSMIHVQKWLGKETTIKRKIVDEKMVVECKM^ 120 
Sbjct:62 TK- KLGVEFDyYTGDNRFVVLiyroyDNyLLVCVQKGIX3NETSRTAEI.YGRTPEL 120 

NOV32:121 IVSTRIYE 128 (SEQ ID NO:400) 
Sbjct:121 LFETATKE 128 (SEQ ID NO:401) 
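
The "% aligned" figure in the domain analysis is consistent with the fraction of the domain model covered by the subject coordinates: positions 2 to 128 of a 145-residue model give 87.6%. The following is a minimal Python sketch of that calculation. The formula is an assumption inferred from the reported numbers, and the function name is hypothetical.

```python
def percent_aligned(sbjct_start: int, sbjct_end: int, cd_length: int) -> float:
    """Percentage of a domain model covered by an alignment.

    Coordinates are 1-based and inclusive, as in the Sbjct lines.
    The formula is an assumption inferred from the reported figures.
    """
    if not (1 <= sbjct_start <= sbjct_end <= cd_length):
        raise ValueError("coordinates must lie within the domain model")
    covered = sbjct_end - sbjct_start + 1
    return round(100.0 * covered / cd_length, 1)


print(percent_aligned(2, 128, 145))
```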



The fatty acid-binding protein (FABP) family consists of small, cytosolic proteins 
believed to be involved in the uptake, transport, and solubilization of their hydrophobic 
ligands. Recently, a number of family members have been identified that are secreted, such as 
gastrotropin and mammary-derived growth inhibitor. The family is implicated in general lipid 
metabolism, acting as intracellular transporters of hydrophobic metabolic intermediates and 
as carriers of lipids between membranes. Members of this family have highly conserved 
sequences and tertiary structures, and have probably diverged from a common ancestor. 
Using an antibody against testis lipid-binding protein, a member of the FABP family, 
Kingma et al. (1998) identified a protein from bovine retina and testis that coeluted with 
exogenously added docosahexaenoic acid during purification. Amino acid sequencing and 
subsequent isolation of its cDNA revealed it to be nearly identical to a bovine protein 
expressed in the differentiating lens and to be the likely bovine homologue of the human 
epidermal fatty acid-binding protein (E-FABP). From quantitative Western blot analysis, it 
was estimated that bovine E-FABP comprised 0.9%, 0.1%, and 2.4% of retina, testis, and 
lens cytosolic proteins, respectively. Binding studies using the fluorescent probe ADIFAB 
indicated that this protein bound fatty acids of differing levels of saturation with relatively 
high affinities. Kd values ranged from 27 to 97 nM. In addition, the protein was 
immunolocalized to the Muller cells in the retina as well as to Sertoli cells in the testis. The 
location of bovine E-FABP in cells known to be supportive to other cell types in their tissues 
and the ability of E-FABP to bind a variety of fatty acids with similar affinities indicate that it 
may be involved in the uptake and transport of fatty acids essential for the nourishment of the 
surrounding cell types. See InterPro IPR000463. 

The protein similarity information, expression pattern, cellular localization, and map 
location for the NOV32 protein and nucleic acid disclosed herein suggest that this Testis 
Lipid Binding Protein-like protein may have important structural and/or physiological 
functions characteristic of the fatty-acid binding protein family. Therefore, the nucleic acids 
and proteins of the invention are useful in potential diagnostic and therapeutic applications 
and as a research tool. These include serving as a specific or selective nucleic acid or protein 
diagnostic and/or prognostic marker, wherein the presence or amount of the nucleic acid or 
the protein is to be assessed. These also include potential therapeutic applications such as 
the following: (i) a protein therapeutic, (ii) a small molecule drug target, (iii) an antibody 
target (therapeutic, diagnostic, drug targeting/cytotoxic antibody), (iv) a nucleic acid useful in 
gene therapy (gene delivery/gene ablation), (v) an agent promoting tissue regeneration in 
vitro and in vivo, and (vi) a biological defense weapon. 

The NOV32 nucleic acids and proteins of the invention have applications in the 
diagnosis and/or treatment of various diseases and disorders. For example, the compositions 
of the present invention will have efficacy for the treatment of patients suffering from 
fertility disorders as well as other diseases, disorders and conditions. 

These antibodies may be generated according to methods known in the art, using 
prediction from hydrophobicity charts, as described in the "Anti-NOVX Antibodies" section 
below. The disclosed NOV32 protein has multiple hydrophilic regions, each of which can be 
used as an immunogen. In one embodiment, a contemplated NOV32 epitope is from about 
amino acids 15 to 25. In another embodiment, a contemplated NOV32 epitope is from about 
amino acids 26 to 28. In other specific embodiments, contemplated NOV32 epitopes are 
from about amino acids 48 to 50, 52 to 60, 61 to 64, 68 to 71, 76 to 78, 82 to 83, 97 to 98, 99 
to 101, 104 to 107, 114 to 116, 118 to 119 and 122 to 124. 
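
Hydrophilic regions of the kind listed above can be located with a sliding-window hydropathy profile. The following is a minimal Python sketch, assuming the Kyte-Doolittle scale, a 7-residue window and a cutoff of -1.0. None of these parameters is stated in the text, and the function names are hypothetical.

```python
# Kyte-Doolittle hydropathy values (the choice of scale is an assumption;
# the text only refers to "hydrophobicity charts").
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=7):
    """Mean hydropathy of every full window along the sequence."""
    scores = [KD[aa] for aa in seq.upper()]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def hydrophilic_regions(seq, window=7, threshold=-1.0):
    """1-based inclusive residue ranges covered by windows whose mean
    hydropathy falls below the threshold; adjacent ranges are merged."""
    regions = []
    for i, score in enumerate(hydropathy_profile(seq, window)):
        if score < threshold:
            start, end = i + 1, i + window
            if regions and start <= regions[-1][1] + 1:
                regions[-1] = (regions[-1][0], end)
            else:
                regions.append((start, end))
    return regions


# Synthetic example: a lysine-rich stretch flanked by isoleucines.
print(hydrophilic_regions("IIIIIIIKKKKKKKIIIIIII"))
```

A candidate immunogen would then be chosen from the reported ranges. The window size and cutoff strongly affect which ranges are returned, so they should be treated as tunable.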

NOV33 

A disclosed NOV33 (designated CuraGen Acc. No. CG57356-01), which encodes a 
novel Intracellular Thrombospondin Domain Containing Protein-like protein and includes 
the 1238 nucleotide sequence (SEQ ID NO:99) is shown in Table 33A. An open reading 
frame for the mature protein was identified beginning with an TAG initiation codon at 
nucleotides 2-4 and ending with a TAA stop codon at nucleotides 1236-1238. Putative 
untranslated regions are underlined in Table 33A, and the start and stop codons are in bold 
letters. 



Table 33A. NOV33 Nucleotide Sequence (SEQ ID NO:99) 

GTACGTCTAGTCCTGAAACCAGCTT^ 

CCAACCCTTCCCCAGACCGCGATTCCGACAAGAGACGGGGCACCCTTCATTGCAAAGAGATTTCCCaVGA^ 
CTCCTTGATCTACCAAACTTTCCAGATCTTTCCAAAGCTGATATCAATOGGCAGT^ 

TAGAGGTGGTCGACGGTCCTGACTCTGAAGCa^ATAAAGATCAGCT^TCCGGAGAATAAGCCCAGCTGGTCAGTCCC 

ATCCCCCGACTGGCGGGCCTQGTGGCAGAGGTCCCTGTCCTTGGCCAGGGCAAACAGCGGGGACCATO 

TACGACAGTACCTCAGACGACAGCAACTTCOTCAACCCCCCCAGGGGGTGGGACCATACAGCCCCAGGCm^ 

CTTTTGAAACCAAAGATCAGCCAGAATATGATTCCACAGATGGCGAGGGTGACTGGAGTCTCTGGTCTG^ 

CGTCACCTGCGGGAACGGCAACCAGAAACGGACCCGGTCTTGTGGCTACGCGTGCACT 

TGTGACCGTCCAAACTGCCCAGGAATTGAAGACACTTTTAGGACAGCTGCCaiCCG 

GCGAGGAGTTTAATGCCACCTUVACTCTTTGAAGTTGACACAGACaGCTGTGAGCG^^ 

GTTCTTAAAGAAGTAOVTGCACAAGGTGATGAATGACCTGCCCAGCTGCCCCTGCTCCTACCCCACTGAGGT^ 
TAO^GCACGGCTGACM'CrrCGACCGCATCAAGCGCAAGGACrTCCGCTGGAAGGACGCCAGC^ 
AGCTGGAGATCTACMGCCCACTGCCCGGTACTGCATCCGCTCCATGCrGTCCCTGGAGAGCACC^ 
ACAGCACTGCTGCTACGGC^ACAACATGCAGCTCATCACCAGGGGCAAGGGGGCGG^ 

accgagttctccgcggagctccactacaaggtggacgtcctgccctggattatctgo^gggtg^ 
ataacgaggccx:ggcctcccaacaacggacaggagtgcacagagagcccctcggacgaggactacatcaagc^ 

CCAAGAGGCCTiiGGGAATATTAA^ 



The disclosed NOV33 nucleic acid sequence maps to chromosome 7 and has 373 of 
512 bases (72%) identical to a gb:GENBANK-ID:AF111168|acc:AF111168.2 mRNA from 
Homo sapiens (Homo sapiens serine palmitoyl transferase, subunit II gene, complete cds; and 
unknown genes) (E = 23e"^^). 

A disclosed NOV33 polypeptide (SEQ ID NO: 100) is 411 amino acid residues in 
length and is presented using the one-letter amino acid code in Table 33B. The SignalP, 
Psort and/or Hydropathy results predict that NOV33 does not have a signal peptide and is 
likely to be localized to the cytoplasm with a certainty of 0.6500. In alternative 
embodiments, a NOV33 polypeptide is located to the mitochondrial matrix space with a 
certainty of 0.1000 or the lysosome (lumen) with a certainty of 0.1000. 



Table 33B. Encoded NOV33 Protein Sequence (SEQ ID NO:100) 

TCSPETSFSI^KEAPREHLDHQAAHQPFPRPRFRQETGHPSLQRDFPRSFLIiDIiPNFPDLSKADINGQNPNIQ 

VTIEVVDGPDSEADKDQHPENKPSWSVPSPDWRAWWQRSLSIARANSGDQDYKYDSTSDDSNFLNPPRGi^ 

APGHRTFETKDQPEyDSTIX3EGDWSI.WSVCSVTCGNGNQKRTRSCGyACTATESRTCDRPNCPGIEDTFRTM 

TEVSIiIAGSEEFNATKLFEVDTDSCERWMSCKSEFLKKYMHKVMNDI^SCPCSYPTE^ 

FRWKDASGPKEKIJEIYKPTARYCIRSMLSLESTTLAAQHCCyGDNMQLITRGKGAGTPNIiIGTO 

DVLPWIICKGPWSRYMET^PPNHGQECTESPSDEDYIKQFQEAREY 



The NOV33 amino acid sequence was found to have 162 of 164 amino acid residues 
(98%) identical to, and 163 of 164 amino acid residues (99%) similar to, the 361 amino acid 
residue ptnr:TREMBLNEW-ACC:CAC16127 protein from Homo sapiens (Human) 
(BA149I18.1 (NOVEL PROTEIN)) (E - 3.6e^% 

NOV33 is predicted to be expressed in at least the following tissues: lung, testis, and 
B-cell. Expression information was derived from the tissue sources of the sequences that were 
included in the derivation of the sequence of NOV33. 

NOV33 also has homology to the amino acid sequences shown in the BLASTP data 
listed in Table 33C. 



Table 33C. BLAST results for NOV33 

Gene Index/Identifier | Protein/Organism | Length (aa) | Identity (%) | Positives (%) | Expect 
gi|13374941|emb|CAC16127.2| (AL133463) | bA149I18.1 (novel protein) [Homo sapiens] | 391 | 389/391 (99%) | 390/391 (99%) | 0.0 
gi|4186183|gb|AAD09622.1| (AF111168) | unknown [Homo sapiens] | 658 | 178/392 (45%) | 238/392 (60%) | 5e-82 
gi|17389974|gb|AAH17997.1|AAH17997 (BC017997) | Unknown (protein for IMAGE:4252124) [Homo sapiens] | 151 | 149/151 (98%) | 150/151 (98%) | 6e-82 
gi|13559287|emb|CAC36074.1| (AL050320) | dJ1077I2.1 (novel protein) [Homo sapiens] | 60 | 49/49 (100%) | 49/49 (100%) | 3e-20 
gi|4502359|ref|NP_001695.1| (NM_001704) | brain-specific angiogenesis inhibitor 3 [Homo sapiens] | 1522 | 28/66 (42%) | 36/66 (54%), Gaps = 10/66 (15%) | 6e-05 



The homologous regions of these sequences are shown graphically in the ClustalW 
analysis shown in Table 33D. 



Table 33D. ClustalW Analysis of NOV33 

1) NOV33 (SEQ ID NO:100) 
2) gi|13374941 (SEQ ID NO:402) 
3) gi|4186183 (SEQ ID NO:403) 
4) gi|17389974 (SEQ ID NO:404) 
5) gi|13559287 (SEQ ID NO:405) 

[ClustalW alignment of sequences 1 to 5; the aligned residue rows are not legible in this copy.] 

Table 33E lists the domain description from DOMAIN analysis results against 
NOV33. This indicates that the NOV33 sequence has properties similar to those of other 
proteins known to contain these domains. 

Table 33E. Domain Analysis of NOV33 

gnl|Smart|smart00209, TSP1, Thrombospondin type 1 repeats; Type 1 repeats in 
thrombospondin-1 bind and activate TGF-beta. 

CD-Length = 51 residues, 98.0% aligned 
Score = 47.4 bits (111), Expect = 2e-06 

NOV33:168 GDWSLWSVCSVTCGNGNQKRTRSC GYACT--ATESRTCDRPNCP 209 (SEQ ID NO:406) 
Sbjct:2 GBWSEWSPCSVTGGGGVQTRTRCCNPPPNGGGPCTGPDTETRACNEQPCP 51 (SEQ ID NO:407) 

gnl|Pfam|pfam00090, tsp_1, Thrombospondin type 1 domain. 

CD-Length = 48 residues, 100.0% aligned 
Score = 43.9 bits (102), Expect = 2e-05 

NOV33:168 GDWSLWSVCSVTCGNGNQKRTRSC GYACT- -ATESRTCDRPNC 208 (SEQ ID NO:408) 
Sbjct:1 SPWSEWSPCSVTCGKGIRTRQRTCNSPAGGKPCTGDAQETBACMMDPC 48 (SEQ ID NO:409) 


The thrombospondin type 1 repeat was first described in 1986 by Lawler & Hynes. It 
was found in the thrombospondin protein, where it is repeated 3 times. Since then, a number of 
proteins involved in the complement pathway (properdin, C6, C7, C8A, C8B, C9), as well as 
extracellular matrix proteins such as mindin, F-spondin, SCO-spondin and even the 
circumsporozoite surface protein 2 and TRAP proteins of Plasmodium, have been found to 
contain one or more instances of this repeat. It has been implicated in cell-cell interaction, 
inhibition of angiogenesis and apoptosis. The intron-exon organization of the properdin gene 
confirms the hypothesis that the repeat might have evolved by a process involving exon 
shuffling. A study of properdin structure provides some information about the structure of the 
thrombospondin type 1 repeat. See InterPro IPR000884. 

The protein similarity information, expression pattern, cellular localization, and map 
location for the NOV33 protein and nucleic acid disclosed herein suggest that this novel 
intracellular thrombospondin domain containing protein-like protein may have important 
structural and/or physiological functions characteristic of the novel intracellular 
thrombospondin domain containing protein family. Therefore, the nucleic acids and proteins 
of the invention are useful in potential diagnostic and therapeutic applications and as a 
research tool. These include serving as a specific or selective nucleic acid or protein 
diagnostic and/or prognostic marker, wherein the presence or amount of the nucleic acid or 
the protein is to be assessed. These also include potential therapeutic applications such as 
the following: (i) a protein therapeutic, (ii) a small molecule drug target, (iii) an antibody 
target (therapeutic, diagnostic, drug targeting/cytotoxic antibody), (iv) a nucleic acid useful in 
gene therapy (gene delivery/gene ablation), (v) an agent promoting tissue regeneration in 
vitro and in vivo, and (vi) a biological defense weapon. 

The NOV33 nucleic acids and proteins of the invention have applications in the 
diagnosis and/or treatment of various diseases and disorders. For example, the compositions 
of the present invention will have efficacy for the treatment of patients suffering from: 
systemic lupus erythematosus, autoimmune disease, asthma, emphysema, scleroderma, 
allergy, ARDS; fertility, hypogonadism; immunological disease and disorders as well as 
other diseases, disorders and conditions. 

These antibodies may be generated according to methods known in the art, using 
prediction from hydrophobicity charts, as described in the "Anti-NOVX Antibodies" section 
below. The disclosed NOV33 protein has multiple hydrophilic regions, each of which can be 
used as an immunogen. In one embodiment, a contemplated NOV33 epitope is from about 
amino acids 10 to 40. In another embodiment, a contemplated NOV33 epitope is from about 
amino acids 55 to 60. In other specific embodiments, contemplated NOV33 epitopes are 
from about amino acids 90 to 102, 110 to 140, 145 to 155, 190 to 195, 202 to 205, 240 to 
255, 260 to 305, 330 to 360 and 370 to 405. 

NOV34 

One NOVX protein of the invention, referred to herein as NOV34, includes three 
Ornithine Decarboxylase-like proteins. The disclosed proteins have been named NOV34a, 
NOV34b and NOV34c. 

NOV34a 

A disclosed NOV34a (designated CuraGen Acc. No. CG57258-01), which encodes a 
novel Ornithine Decarboxylase-like protein and includes the 1463 nucleotide sequence 
(SEQ ID NO: 101) is shown in Table 34A. An open reading frame for the mature protein was 
identified beginning with an ATG initiation codon at nucleotides 51-53 and ending with a 
TGA stop codon at nucleotides 1413-1415. Putative untranslated regions are underlined in 
Table 34A, and the start and stop codons are in bold letters. 
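
The open reading frame coordinates fix the length of the encoded polypeptide: the span from the first base of the initiation codon to the last base of the stop codon must be a whole number of codons, and the stop codon encodes no residue. The following is a minimal Python sketch of that check, using the NOV34a coordinates quoted above. The function name is hypothetical.

```python
def orf_protein_length(start_first_nt: int, stop_last_nt: int) -> int:
    """Number of residues encoded by an ORF.

    start_first_nt: 1-based position of the first base of the start codon.
    stop_last_nt:   1-based position of the last base of the stop codon.
    """
    span = stop_last_nt - start_first_nt + 1
    if span < 6 or span % 3 != 0:
        raise ValueError("ORF span must be a whole number of codons")
    return span // 3 - 1  # the stop codon encodes no residue


# NOV34a: ATG at nucleotides 51-53, TGA at nucleotides 1413-1415.
print(orf_protein_length(51, 1415))
```

The same check applied to the NOV34b coordinates (42 to 1250) and the NOV34c coordinates (23 to 679) gives 402 and 218 residues.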

Table 34A. NOV34a Nucleotide Sequence (SEQ ID NO:101) 

GGCGGCTGCAGCAGCGGCrCCATCCAGCCCGTCAGCTCCTCCTGCAAQQCA TQGCTGGCTACCTGAGTGAATCGGA 

CTTTGTGATGGTGGAGGAGGGCTTCAGTACCCGAGACCTGCTGAAGGAACTCACTCTGGGGGCCTCACAGGACGAG 

GTAGCTGCCTTCTTCGTGGCTGACCTGGGTGCCATAGTGAGGAAGCACTTTTGCTTTCTGAAGTGCCrrGCCACCT 

TCCGGCCCTTTTATGCTGTCAAGTGCAACAGCAGCCCAGGTGTGCTGAAGGTTCTGGCCCA6CTGGGGCTGGGCTT 

TAGCTGTGCCAACAAGGCAGAGATGGAGTTGGTCCAGCATATTGGAATCCCTGCCAGTAAGATCATCTGCGCCAAC 
CCCTGTAAGCAAATTGCACAGATCAAATATGCTGCCAAGCATGGGATCCAGCTGCTGAGCTTTGACAATGAGA 

AGCTGGCAAAGGTGGTAAAGAGCCa.CCCCAGTGCCAAGATGGTTCTGTGCATTGCTACCGATGACTCCCACTCCCT 

GAGCTGCCTGAGCCTAAAGTTTGGAGTGTCACTGAAATCCTGCAGACACCTGCTTGAAAAT6CGAAGAAGCACCAT 

GTGGAGGTGGTGGGTGTGAGTTTTCACATTGGCAGTGGCTGTCCTGACCCTCAGGCCTATGCTCAGTCCATCGCAG 

ACGCCCGGCTCGTGTTTGAAATGGGCACCGAGCTGGGTCACAAGATGCACGTTCTGGACCTTGGTGGTGGCTTCCC 

TGGCACAGAAGGGGCCAAAGTGAGATTTGAAGAGATTGCTTCCGTGATCAACTCAGCCTTGGACCTGTACTTCCCA 

GAGGGCIX3TGGCGTGGACATCrTTGCTGAGCTGGGGCGCTACTACGlX3ACCrrCGGCCTTC^ 

TCATTGCCAAGAAGGAGGTTCTGCTAGACCAGCCTGGCAGGGAGGAGGAAAATGGTTCCACCTCCAAGACCATC^ 

GTACCACCTTGATGAGGGCGTGTATGGGATCTTCAACTCAGTCCTGTTTGACAACATCTGCCCTACCCCCATCCTG 

CAGAAGAAACCATCCACGGAGCAGCCCCTGTACAGCAGCAGCCTGTGGGGCCCGGCGGTTGATGGCTGTGATTGCG 

TGGCTGAGGGCCTGTGGCTGCCGCAACTACACGTAGGGGACTGGCTGGTCTTTGACAACATGGGCGCCTACA 

GGGCATGGGTTCCCCCTTTTGGGGGACCCAGGCCTGCCACATCACCTATGCCATGTCCCGGGTGGCCTGGCGAAGG 

CAGCTGATGGCTGCAGAACAGGAGGATGACGTGGAGGGTGTGTGCAAGCCTCTGTCCTGCGGCTGGGAGATCACAG 

ACAggCTGTGCGTGGGCCCTGTCTTCACCCCAGCGAGCATCATGTG AGTGGGCCTCGTTCCCCCCGGAGAATCCCA 

GCGGGGCCTCAGAGATGCA 



The disclosed NOV34 nucleic acid sequence maps to chromosome 1 and has 948 of 
1373 bases (69%) identical to a gb:GENBANK-ID:AF217544|acc:AF217544.2 mRNA from 
Xenopus laevis (Xenopus laevis ornithine decarboxylase-2 mRNA, complete cds) (E = 9.8e* 



The NOV34 polypeptide (SEQ ID NO: 102) is 454 amino acid residues in length and 
is presented using the one-letter amino acid code in Table 34B. The SignalP, Psort and/or 
Hydropathy results predict that NOV34a does not have a signal peptide and is likely to be 
localized to the cytoplasm with a certainty of 0.4500. In alternative embodiments, a NOV34 
polypeptide is located to the microbody (peroxisome) with a certainty of 0.4387, the 
mitochondrial matrix space with a certainty of 0.1000, or the lysosome (lumen) with a 
certainty of 0.1000. 



Table 34B. Encoded NOV34a Protein Sequence (SEQ ID NO:102) 



MAGYLSESDFVMVEEGFSTRDLLKELTLGASQDEVAP^FVADLGAIVRKHFCFLKOLPRVRPF^ 

VLKVIAQLGLGFSCANKAEMELVQHIGIPASKIICANPCKQIAQIKYAAKHGIQLLSFDNEMEIAK^ 

AKMVLCIATDDSHSLSCLSLKFGVSLKSCRHLLENAKKHEIVEVVGVSFHIGSGCPDPQAYAQSIADARIOTE^ 

GTELGHKMHVLDIiGGGFPGTEGAKVRFEE I AS VINSALDLYFPEGCGVDI FAELGRYYVTSAFTVAVS 1 1 AKK 

EVLLIX3PGREEENGSTSKTIVYHIJ>EGVYGIFNSVLFDNICPTPILQKKPSTEQPLYSSSLWGPAVDGC]^^ 

EGLWLPQLHVGDWLVFDNMGAYWGMGSPFWGTQACHITYAMSRVAWRRQLMAAEQEDDVEGVCKPLS^^ 

TDTLCVGPVFTPASIM 



A disclosed NOV34a amino acid sequence was found to have 277 of 456 amino acid 
residues (60%) identical to, and 353 of 456 amino acid residues (77%) similar to, the 456 
amino acid residue ptnr:SPTREMBL-ACC:Q9I8S4 protein from Xenopus laevis (African 
clawed frog) (ORNITHINE DECARBOXYLASE-2) (E - 3,4e-^^^). 

NOV34a is expressed in at least the following tissues: Bone Marrow, Lymph node, 
Prostate, Right Cerebellum, and Substantia Nigra. Expression information was derived from 
the tissue sources of the sequences that were included in the derivation of the sequence of 
NOV34. 

NOV34b 

A disclosed NOV34b (designated CuraGen Acc. No. CG57258-02), which encodes a 
novel Ornithine Decarboxylase-like protein and includes the 1613 nucleotide sequence (SEQ 
ID NO: 103) is shown in Table 34C. An open reading frame for the mature protein was 
identified beginning with an ATG initiation codon at nucleotides 42-44 and ending with a 
TGA stop codon at nucleotides 1248-1250. Putative untranslated regions are underlined in 
Table 34C, and the start and stop codons are in bold letters. 
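
Translating the open reading frame codon by codon reproduces the start of the encoded protein. The following is a minimal Python sketch using the standard genetic code. The input is the first 33 coding nucleotides of Table 34C, with the scan-damaged "ATQ" read as "ATG" (an assumption), and the function name is hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(nucleotides: str, start: int = 1) -> str:
    """Translate from a 1-based start position until the first stop
    codon or the end of the sequence, whichever comes first."""
    seq = nucleotides.upper()
    protein = []
    for i in range(start - 1, len(seq) - 2, 3):
        residue = CODON_TABLE[seq[i:i + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


# First 11 codons of the NOV34b open reading frame.
print(translate("ATGGCTGGCTACCTGAGTGAATCGGACTTTGTG"))
```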



Table 34C. NOV34b Nucleotide Sequence (SEQ ID NO:103) 

AGCAGCGGCTCCATCCAGCCCGTCAGCTCCTCCTGCAAGGC ATQGCTGGCTACCTGAGTGAATCGGACTTTGTGA 
TGGTGGAGGAGGGCTTCAGTACCCGAGACCTGCTGAAGGAACTCACTCTGGGGGCCTCACAGGCCACCACGGCAG 
AGATGGAGTTGGTCCAGCATATTGGAATCCCTGCCAGTAAGATCATCTGCGCCAACCCCTGTAAGCAAATTGCAC 
AGATCAAATATGCTGCCAAGCATGGGATCCAGCTGCTGAGCTTTGACAATGAGATGGAGCTGGCAAAGGTGGTAA 
AGAGCCACCCCAGTGCCAAGATGGTTCTGTGCATTGCTACCGATGACTCCCACTCCCTGAGCTGCCTGAGCCTAA 
AGTTTGGAGTGTCACTGAAATCCTGCAGACACCTGCTTGAAAATGCGAAGAAGCACCATGTGGAGGTGGTGGGTG 
TGAGTTTTCACATTGGCAGTGGCTGTCCTGACCCTCAGGCCTATGCTCAGTCCATCGCAGACGCCCGGCTCGTGT 
TTGAAATGGGCACCGAGCTGGGTCACAAGATGCACGTTCTGGACCTTGGTGGTGGCTTCCCTGGCACAGAAGGGG 
CCAAAGTGAGATTT6AAGAGATTGCTTCCGTGATCAACTCAGCCTTGGACCTGTACTTCCCAGAGGGCTGTGGCG 
TGGACATCTTTGCTGAGCTGGGGCGCTACTACGTGACCTCGGCCTTCACTGTGGCAGTCAGCATCATTGCCAAGA 
AGGAGGTTCTGCTAGACCAGCCTGGCAGGGAGGAGGAAAATGGTTCCACCTCCAAGACCATCGTGTACCACCTTG 
ATGAGGGCGTGTATGGGATCTTCAACTCAGTCCTGTTTGACAACATCTGCCCTACCCCCATCCTGCAGAAGAAAC 
CATCCACGGAGCAGCCCCTGTACAGCAGCAGCCTGTGGGGCCCGGCGGTTGATGGCTGTGATTGCGTGGCTGAGG 
GCCTGTGGCTGCCGCAACTACACGTAGGGGACTGGCTGGTCTTTGACAACATGGGCGCCTACACTGTGGGCATGG 
GTTCCCCCTTTTGGGGGACCCAGGCCTGCCACATCACCTATGCCATGTCCCGGGTGGCCTGGGAAGCGCTGCGAA 
GGCAGCTGATGGCTGCAGAACAGGAGGATGACGTGGAGGGTGTGTGCAAGCCTCTGTCCTGCGGCTGGGAGATCA 
CAGACACCCTGTGCGTGGGCCCTGTCTTCACCCCAGCGAGCATCATGTG AGTGGGCCTCGTTCCCCCCGGAGAAT 
CCCAGCGGGGCCTCAGAGATGCATCTGGGAGAGGTGGGGAAGATGGCAGGCAAGGGTACCCTTGGCCAGGACTCT 
GGTGCCCACCCTGCCACCCCCGCGCTCCACCTGCAGTGTTTCTGCCCTGTAAATAGGACCAGTCTTACACTCGCT 
GTAGTTCAAGTATGCAACATAAATCCTGTTCCTTCCAGCTGTGTCTGCCTCCTCTGCAGTGCAAGGGGCCTGGTC 
AGCCAGGTGTGGGGGTGTTCTTGGGGTCTCCTTTGGTCTCCTTCCCACCTTTGTAAATATAATGCAAATAAATAA 
ATATTTAGGTTTTTAAAAACTGAAAAAAAAAAAAAAAA 



The disclosed NOV34b nucleic acid sequence maps to chromosome 1 and has 1482 of 
1489 bases (99%) identical to a gb:GENBANK-ID:BC010449|acc:BC010449.1 mRNA from 
Homo sapiens (Homo sapiens, Similar to ornithine decarboxylase 1, clone MGC:18232 
IMAGE:4156927, mRNA, complete cds) (E = 0.0). 

A disclosed NOV34b polypeptide (SEQ ID NO: 104) is 402 amino acid residues in 
length and is presented using the one-letter amino acid code in Table 34D. The SignalP, 
Psort and/or Hydropathy results predict that NOV34b does not have a signal peptide and is 
likely to be localized to the cytoplasm with a certainty of 0.4500. In alternative 
embodiments, a NOV34b polypeptide is located to the microbody (peroxisome) with a 
certainty of 0.4154, the mitochondrial matrix space with a certainty of 0.1000 or the 
lysosome (lumen) with a certainty of 0.1000. 

Table 34D. Encoded NOV34b Protein Sequence (SEQ ID NO:104) 

MAGYLSESDFVMVEEGFSTRDLIJCBLTLGASQATTAEMELVQHIGIPASKIICa^ 

DNEMELAKVVKSHPSAKMVLCIATDDSHSLSCLSLKFGVSLKSC^ 

QSIM>J^VFEMGTBLGHKMHVIiDLGGGFPGTEGAKVRFEEIASVINSAIJ5I^^ 

VAVSIIAKKEVLII)QPGREEENGSTSKTIVYHLDEGVYGIFNSVLFDNICPTPILQiCKPST^ 

GCDCVJ^GLWLPQLHVGDWLVFDNMGAYTVGMGSPFWGTQACailTYMSRVAT^^ 

SCGWEITDTI>CVGPVFTPASIM 

The NOV34b amino acid sequence was found to have 373 of 381 amino acid residues 
(97%) identical to, and 375 of 381 amino acid residues (98%) similar to, the 460 amino acid 
residue ptnr:TREMBLNEW-ACC:AAH10449 protein from Homo sapiens (Human) 
(SIMILAR TO ORNITHINE DECARBOXYLASE 1) (E = 4.1 e'^^^). 

NOV34b is expressed in at least the following tissues: Brain, Lung, Heart, Pineal 
Gland, Colon, Peripheral Blood, Lymphoid tissue, Bone Marrow, Lymph node, Prostate, 
Right Cerebellum, and Substantia Nigra. Expression information was derived from the tissue 
sources of the sequences that were included in the derivation of the sequence of CuraGen 
Acc. No. CG57258-02. The sequence is also predicted to be expressed in the Brain because 
of the expression pattern of (GENBANK-ID: gb:GENBANK-ID:BC010449|acc:BC010449.1), 
a closely related Homo sapiens, Similar to ornithine decarboxylase 1, clone MGC:18232 
IMAGE:4156927, mRNA, complete cds homolog in species Homo sapiens. 

NOV34c 

A disclosed NOV34c (designated CuraGen Acc. No. CG57258-03), which encodes a 
novel Ornithine Decarboxylase-like protein and includes the 679 nucleotide sequence (SEQ 
ID NO: 105) is shown in Table 34E. An open reading frame for the mature protein was 
identified beginning with an ATG initiation codon at nucleotides 23-25 and ending with a 
TGA stop codon at nucleotides 677-679. Putative untranslated regions are underlined in 
Table 34E, and the start and stop codons are in bold letters. 

Table 34E. NOV34c Nucleotide Sequence (SEQ ID NO:105) 

CCGTCAGCTCCTCCTGCAAGGCA TGGCTGGCTACCTQAGCGAATCGGACTTTGTGATGGTGGAGGAGGGCTTCA 
GTACCCGAGACCTGCTGAAGGAACTCACTCTGGGGGCCTCACAGGCCACCACGGACGAGGTAGCTGCCTTCTTC 
GTGGCTGACCTGGGTGCCATAGTGAGGAAGCACTTTTGCTTTCTGAAGTGCCTGCCACGAGTCCGGCCCTTTTA 
TGCTGTCAAGTGG^ACAGCAGCCCAGGTGTGCTGAAGGTTCTGGCCCAGCTGGGGCTGGGCTTTAGCTGTGCCA 
ACATCTGCCCTACCCCCATCCTGCAGAAGAAACCATCCACGGAGCAGCCCCTGTACAGCAGCAGCCTGTGGGGC 
CCGGCGGTTGATGGCTGTGATTGCGTGGCTGAGGGCCTGTGGCTGCCGCAACTACACGTAGGGGACTGGCTGGT 
CTTTGACAACATGGGCGCCTACACTGTGGGCATGGGTTCCCCCTTTTGGGGGACCCAGGCCTGCCACATCACCT 
ATGCCATGTCCCGGGTGGCCTGGGAAGCGCTGCGAAGGCAGCTGATGGCTGCAGAACAGGAGGATGACGTGGAG 
GGTGTGTGCAAGCCTCTGTCCTGCGGCTGGGAGATCACAGACACCCTGTGCGTGGGCCCTGTCTTCACCCCAGC 
GAGCATCATGTOA 

The disclosed NOV34c nucleic acid sequence maps to chromosome 1 and has 388 of 
390 bases (99%) identical to a gb:GENBANK-ID:BC010449|acc:BC010449.1 mRNA from 
Homo sapiens (Homo sapiens, Similar to ornithine decarboxylase 1, clone MGC:18232 
IMAGE:4156927, mRNA, complete cds) (E = 2.3e'^^^). 

A disclosed NOV34c polypeptide (SEQ ID NO: 106) is 218 amino acid residues in 
length and is presented using the one-letter amino acid code in Table 34F. The SignalP, Psort 
and/or Hydropathy results predict that NOV34c does not have a signal peptide and is likely to 
be localized to the microbody (peroxisome) with a certainty of 0.4748. In alternative 
embodiments, a NOV34c polypeptide is located to the cytoplasm with a certainty of 0.4500, 
the mitochondrial matrix space with a certainty of 0.1000, or the lysosome (lumen) with a 
certainty of 0.1000. 
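
The primary localization reported above is the compartment with the highest certainty, and the alternative embodiments follow in descending order. The following is a minimal Python sketch of that ranking, using the NOV34c values from the text. The data structure and function name are hypothetical.

```python
# Localization certainties for NOV34c as quoted in the text.
nov34c_certainties = {
    "cytoplasm": 0.4500,
    "microbody (peroxisome)": 0.4748,
    "mitochondrial matrix space": 0.1000,
    "lysosome (lumen)": 0.1000,
}


def rank_localizations(certainties):
    """Compartments sorted from highest to lowest certainty."""
    return sorted(certainties.items(), key=lambda item: item[1], reverse=True)


ranked = rank_localizations(nov34c_certainties)
primary, alternatives = ranked[0], ranked[1:]
print(primary[0], [name for name, _ in alternatives])
```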



Table 34F. Encoded NOV34c Protein Sequence (SEQ ID NO: 106) 

MAGYLSESDFVMVEEGFSTRDLLKELTLGASQATTDEVAAFFVADLGAIVRKHFCFLKCLPRVRPFYAVKC^ 

GVI.KVIiAQLGLGFSCANICPTPILQKKPSTEQPLYSSSLWGPAVDGCDCVAEGLWLPQIiHVGDWLVFD^ 

GMGSPFWGTQACHITYAMSRVAWEALRRQLMAAEQEDDVEGVCKPLSCGWEITDTLCVGPVFTPASIM 



The NOV34c amino acid sequence was found to have 127 of 127 amino acid residues 
(100%) identical to, and 127 of 127 amino acid residues (100%) similar to, the 460 amino 
acid residue ptnr:TREMBLNEW-ACC:AAH10449 protein from Homo sapiens (Human) 
(SIMILAR TO ORNITHINE DECARBOXYLASE 1) (E = 9.1e "*). 

NOV34c is expressed in at least the following tissues: Brain, Lung, Heart, Pineal 
Gland, Colon, Peripheral Blood, Lymphoid tissue, Bone Marrow, Lymph node, Prostate, 
Right Cerebellum, and Substantia Nigra. Expression information was derived from the tissue 
sources of the sequences that were included in the derivation of the sequence of CuraGen 
Acc. No. CG57258-03. The sequence is predicted to be expressed in the brain because of the 
expression pattern of (GENBANK-ID: gb:GENBANK-ID:BC010449|acc:BC010449.1), a 
closely related Homo sapiens, Similar to ornithine decarboxylase 1, clone MGC:18232 
IMAGE:4156927, mRNA, complete cds homolog in species Homo sapiens. 

Homologies to any of the above NOV34a, NOV34b and NOV34c proteins will be 
shared by other NOV34 proteins insofar as they are homologous to each other as shown 
below. Any reference to NOV34 is assumed to refer to NOV34a, NOV34b and NOV34c 
proteins in general, unless otherwise noted. 

NOV34a, NOV34b and NOV34c are very closely homologous as is shown in the 
amino acid alignment in Table 34G. 

Table 34G. ClustalW of NOV34a, NOV34b and NOV34c 

[ClustalW alignment of NOV34a, NOV34b and NOV34c; the aligned residue rows are not legible in this copy.] 

NOV34a also has homology to the amino acid sequences shown in the BLASTP data 
listed in Table 34H. 



Table 34H. BLAST results for NOV34a 


Gene Index/Identifier                      Protein/Organism                       Length (aa)   Identity (%)    Positives (%)   Expect

gi|16506287|ref|NP_443724.1| (NM_052998)   hypothetical protein XP_054282;        460           454/460 (98%)   454/460 (98%)   0.0
                                           hypothetical gene supported by
                                           BC010449; ODC-paralog [Homo sapiens]

gi|17444708|ref|XP_054282.2| (XM_054282)   similar to ornithine                   480           454/480 (94%)   454/480 (94%)   0.0
                                           decarboxylase-like protein variant 2
                                           (H. sapiens) [Homo sapiens]

gi|16552627|dbj|BAB71356.1| (AK057051)     unnamed protein product                365           362/365 (99%)   362/365 (99%)   0.0
                                           [Homo sapiens]

gi|15858869|gb|AAL08052.1| (AY050637)      ornithine decarboxylase-like           362           343/354 (96%)   343/354 (96%)   0.0
                                           protein variant 3 [Homo sapiens]

gi|15858867|gb|AAL08051.1| (AY050636)      ornithine decarboxylase-like           374           343/366 (93%)   343/366 (93%)   0.0
                                           protein variant 4 [Homo sapiens]



The homology of these sequences is shown graphically in the ClustalW analysis 
shown in Table 34I. 

Table 34I. ClustalW Analysis of NOV34 



1) NOV34a (SEQ ID NO:102) 
2) NOV34b (SEQ ID NO:104) 
3) NOV34c (SEQ ID NO:106) 
4) gi|16506287 (SEQ ID NO:410) 
5) gi|17444708 (SEQ ID NO:411) 
6) gi|16552627 (SEQ ID NO:412) 
7) gi|15858869 (SEQ ID NO:413) 
8) gi|15858867 (SEQ ID NO:414) 



[ClustalW alignment of NOV34a, NOV34b, NOV34c and the five BLASTP hits of Table 34H; alignment graphic not reproduced.]

Tables 34J and 34K list the domain description from DOMAIN analysis results 
against NOV34. This indicates that the NOV34 sequence has properties similar to those of 
other proteins known to contain these domains. 



Table 34J. Domain Analysis of NOV34a 

gnl|Pfam|pfam02784, Orn_Arg_deC_N, Pyridoxal-dependent decarboxylase, pyridoxal 
binding domain. These pyridoxal-dependent decarboxylases act on ornithine, 
lysine, arginine and related substrates. This domain has a TIM barrel fold. 

CD-Length = 246 residues, 99.2% aligned 

Score = 248 bits (634), Expect = 4e-67 



NOV34 : 


42 


DI^AIV-RKHFCFLKCLPRWPFYAVKCNSSPGVLKVIAQLGLGFSCANKAEMELVQHIG 


100 






III III 1 + llh+llim II 1 ll++lk!l 11 ll + l l + l 1 G 




Sbjct: 


1 


DliGLXVRRIHAIA^QAFLPRIQPFYAVKANSDPAVLRIJjAELGTGFI^^ 


60 


N0V34 : 


101 


I PASKI I CANPCKQI AQIKYAMOIGIQIJjSFDNEMBIJUCVVKSHPSAKMVLCI ATDD 


160 






-t-l +1111111 -^+++11 +11+ ++ II 111+ + 1 I+++I + I 




Sbjct: 


61 


VPPERI IFANPCKDRSBLRYALEHGWCVTVDNVEELEKIJtflLAPEliRLLI^ 


120 



NOV34: 161 LSC LSLKFGVSLKSCRHLLENAKKHHVEWGVSFHIGSGCPDPQAYAQSIADARI, 215 

I III 1+ 11+ 11+ + till ll+llll I +1+ ++ 111 

Sbjct: 121 AHCYLSTGQDSKFGADLEEAEALLKAAKBLGIiNVVGVHFHVGSGCTDABAFVKTi?^ 180 

NOV34: 216 VFEMGT-ELGHKMHVLDLGGGFPGTEGAKVRFEEIASVINSALDLYFPEGCGVDIFAELG 274 

11+ I 111 ++ +1111111 III I 111+11+ II I II 1 

Sbjct: 181 VFDQGADELGFELKIIJDIX3GGFGVDYTGAEDFEEXAEVI11AALBBVFPHDPHPTIIAEPG 240 



NOV34: 275 RYYV 278 (SEQ ID NO:415) 

II i 

Sbjct: 241 RYIV 244 (SEQ ID NO:416) 



gnl|Pfam|pfam00278, Orn_DAP_Arg_deC, Pyridoxal-dependent decarboxylase, C-terminal 
sheet domain. These pyridoxal-dependent decarboxylases act on 
ornithine, lysine, arginine and related substrates. 

CD-Length = 119 residues, 89.9% aligned 

Score = 89.7 bits (221), Expect = 3e-19 

NOV34: 283 VAVSIIAKKEVLLDQPC^BEBNGSTSKTIVYHLDEGVYGIFNSVLFDNICPTPILQKKP 342 

h ++1111 I ++ I +I++++I II i + I +1 ++ 

Sbjct: 1 TLVSNVIAKKTV PSDDBDGKDDTRMYYVNDGGYSSPIRPLLYHAHPHALLLRRS 54 

NOV34: 343 STEQPLYSSSLWGPAVDGCDCVAEGLWLPQLHVGDWLVFDNMGAYTVGMGSPF 395 (SEQ ID NO:417) 

l + l ll+lll I I + + ll+l Mill I + MM 1 I I 
Sbjct: 55 LDEEPPRKSSIWGPTCDSLDKIIKDRLI^ELDVGDWLAFFDTGAYTEAMASNF 107 (SEQ ID NO:418) 

Table 34K. Domain Analysis of NOV34a 

Orn_DAP_Arg_deC  (InterPro) Pyridoxal-dependent decarboxylase   430.6   6.2e-128   1 

Parsed for domains: 

Model            Domain  seq   seq   hmm   hmm       score   E-value 
                         from  to    from  to 
Orn_DAP_Arg_deC  1/1     38    398   1     467 []    430.6   6.2e-128 

Alignments of top-scoring domains: 

Orn_DAP_Arg_deC: domain 1 of 1, from 38 to 398: score 430.6, E = 6.2e-128 


NOV34A 


38 


* - >f yvyDlglHivrrihalwkaf Iprgqynswkpf YAVKansdpavlr 
FFVADLG- - AIVRKHFCFLKCLPR VRPFYAVKCNSSPGVLK 


76 




NOV34A 


77 


ILaelGtHslGfDcaSkgELerVLaaylagvsPerlifanpcKsrselry 

II II l+l 1+1+ +I+++++II 
VIAQLGI* GFSCANKAH4ELVQH IGIPASKIICANPCKQIAQIKY 


120 




NOV34A 


121 


AlehrkMGgwcvtvDnveELekiaklapeaGvkprllLRvkpdvdahah 

1+ 1 I++ +11+ I +t 1 +++I l + l-^ 
AAKH GIQLLSFDNEMEIiAKWKSHPSA KMVLCI ATD- DSHSL 


161 




NOV34A 


162 


crl stGqedsKFGadledgedaealLkaAkelgnlnwGvhFHVGSgi sd 

^ M II 14.4.1+4. ++ 1 1 + 1}+ + + ++ 1 1 1 l + l l + l [ [++I 
+11 III ++ 1 ++ 11^ 11 1 1 1 1 ^ 1 1 ^ 1 M 1 

SCLSL KFGVSLKS CRHLLENAKKHH-VEWGVSFHXGSGCPD 


202 




NOV34A 


203 


leafvkAvrdarnvfdqgadelGfktidlkiLDiGGGfgvdytgtrsqSD 

++^+++•^11! Ih+I Ilk! + 

PQAYAQSIADARLVFEMGT-ELGHK MHVLDLGGGFPGTEGA 


242 




NOV34A 


243 


mSVaedf ee i Aevinaa 1 eel f phagygdpgpt i iaEPGRyivAaagtLv 

mihlll II lll + l +1 + 1 + 
KVRFEEIASVINSALDLYFPE GCGVDIFAELGRYYVTSAFTVA 


285 




NOV34A 


286 


snvi akkevp sddadtt sdslreeskDdt rmyyvnDggygsf irpl lyha 

+ +1111 II 1+ +++ + 1+ +1 I++ l+ll 1+ I+++ 
VS 1 1 AKKEVLLDQ — PGREEE -NGSTSKTI VYHLDEGVYGIFNS VLFDNI 


332 




NOV34A 


333 


hpeal 1 1 rrggevqyqdaeteraadkslsnFsLf qsyPdAwgidqLf Pvl 
I++ I+++ 


341 




NOV34A 


342 


PI r s 1 de epkrks s i vGp t CDsDGklDki ikddGi aedr 1 IPelkpvGDw 
+ 1 1 ++1I++II h I++++ + II 1+ INI 

PSTBQPLYSSSLWGPAVD6 CDCVAE GLWLPQLH- VGDW 


378 




NOV34A 


379 


La f pd t GAYt yamasnyNgF< - * ( SEQ ID NO : 4 1 9 ) 
1^1 +Hllh hi + 1 

LVFDNMGAYTVGMGSPFWGT 398 {SEQ ID NO:420) 







These enzymes are collectively known as group IV decarboxylases. Pyridoxal-dependent 
decarboxylases acting on ornithine, lysine, arginine and related substrates can be 
classified into two different families on the basis of sequence similarities. Members of this 
family, while most probably evolutionarily related, do not share extensive regions of sequence 
similarity. The proteins contain a conserved lysine residue which is known, in mouse ODC, 
to be the site of attachment of the pyridoxal-phosphate group. The proteins also contain a 
stretch of three consecutive glycine residues that has been proposed to be part of a 
substrate-binding region. See InterPro IPR000183 and IPR002432 (Orn_DAP_Arg_decarbxylse). 

Ornithine decarboxylase (ODC) is a key enzyme in polyamine biosynthesis. 
Turnover of ODC is extremely rapid and highly regulated, and is accelerated when polyamine 
levels increase (Murakami et al., (2000). Biochem Biophys Res Commun 267(1):1-6. PMID: 
10623564). Expression and activity of ornithine decarboxylase directly correlate with the 
proliferation stage of cells. Ornithine decarboxylase is transcriptionally induced by the tumor 
promoter TPA (Nguyen-Ba and Vasseur (1999). Oncol Rep 6(4):925-32. PMID: 10373683). 
It has also been shown to be transactivated by the c-myc oncogene in certain cell/tissue types 
and to cooperate with the ras oncogene in malignant transformation of epithelial tissues 
(Fuhrmann et al., (1999). Mutat Res 437(3):205-17. PMID: 10592328; Reddy (1999). J Nutr 
129(7 Suppl):1478S-82S. PMID: 1039562; Nguyen-Ba and Vasseur (1999). Oncol Rep 
6(4):925-32. PMID: 10373683). Furthermore, inhibition of colon carcinogenesis was 
associated with a decrease in colonic mucosal cell proliferation and in the activities of colonic 
mucosal and tumor ornithine decarboxylase and ras-p21 (Reddy (1999). J Nutr 129(7 
Suppl):1478S-82S. PMID: 1039562). The rationale for the inhibition of ornithine 
decarboxylase as a cancer chemopreventive strategy has been strengthened in recent years. 
Recent clinical cancer chemoprevention trials have demonstrated that DFMO, which is an 
inhibitor of ornithine decarboxylase, can be given over long periods of time at low doses that 
suppress polyamine contents in gastrointestinal and other epithelial tissues but cause no 
detectable hearing loss or other side effects (Meyskens and Gerner (1999). Clin Cancer Res 
5(5):945-51. PMID: 10353725). Clinical chemoprevention trials are also in progress to 
investigate the efficacy of DFMO in suppressing surrogate end point biomarkers (e.g., colon 
polyp recurrence) of carcinogenesis in patient populations at elevated risk for the 
development of specific epithelial cancers, including colon, esophageal, breast, cutaneous, 
and prostate malignancies (Meyskens and Gerner (1999). Clin Cancer Res 5(5):945-51. 
PMID: 10353725). Therefore, the novel ornithine decarboxylase described in this invention 
may serve as a potential small molecule drug target for therapeutic intervention. 

The protein similarity information, expression pattern, cellular localization, and map 
location for the NOV34 protein and nucleic acid disclosed herein suggest that this Ornithine 
Decarboxylase-like protein may have important structural and/or physiological functions 
characteristic of the Ornithine Decarboxylase family. Therefore, the nucleic acids and 
proteins of the invention are useful in potential diagnostic and therapeutic applications and as 
a research tool. These include serving as a specific or selective nucleic acid or protein 
diagnostic and/or prognostic marker, wherein the presence or amount of the nucleic acid or 
the protein is to be assessed. These also include potential therapeutic applications such as 
the following: (i) a protein therapeutic, (ii) a small molecule drug target, (iii) an antibody 
target (therapeutic, diagnostic, drug targeting/cytotoxic antibody), (iv) a nucleic acid useful in 
gene therapy (gene delivery/gene ablation), (v) an agent promoting tissue regeneration in 
vitro and in vivo, and (vi) a biological defense weapon. 

The NOV34 nucleic acids and proteins of the invention have applications in the 
diagnosis and/or treatment of various diseases and disorders. For example, the compositions 
of the present invention will have efficacy for the treatment of patients suffering from: 
hemophilia, hypercoagulation, idiopathic thrombocytopenic purpura, autoimmune disease, 
allergies, immunodeficiencies, transplantation, graft versus host disease, lymphedema, 
fertility, Von Hippel-Lindau (VHL) syndrome, Alzheimer's disease, stroke, 
tuberous sclerosis, hypercalcemia, Parkinson's disease, Huntington's disease, cerebral palsy, 
epilepsy, Lesch-Nyhan syndrome, multiple sclerosis, ataxia-telangiectasia, leukodystrophies, 
behavioral disorders, addiction, anxiety, pain, neurodegeneration as well as other diseases, 
disorders and conditions. 

These antibodies may be generated according to methods known in the art, using 
prediction from hydrophobicity charts, as described in the "Anti-NOVX Antibodies" section 
below. The disclosed NOV34 protein has multiple hydrophilic regions, each of which can be 
used as an immunogen. In one embodiment, a contemplated NOV34 epitope is from about 
amino acids 7 to 10. In another embodiment, a contemplated NOV34 epitope is from about 
amino acids 15 to 20. In other specific embodiments, contemplated NOV34 epitopes are 
from about amino acids 25 to 30, 38 to 42, 55 to 70, 95 to 110, 148 to 150, 160 to 163 and 
170 to 190. 
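
The selection of hydrophilic regions from a hydrophobicity chart can be illustrated with a sliding-window calculation. The following is a minimal sketch only, assuming the Kyte-Doolittle hydropathy scale; the window size, the threshold and the function names are hypothetical choices for illustration and are not stated in this disclosure, so the regions it reports need not coincide with the epitope ranges listed above.

```python
# Kyte-Doolittle hydropathy values per residue (one-letter amino acid code).
# Assumption: this scale is used only for illustration; the disclosure does
# not name the scale behind its hydrophobicity charts.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=7):
    """Mean hydropathy of every full window; entry i covers seq[i:i+window]."""
    scores = [KD[aa] for aa in seq]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def hydrophilic_regions(seq, window=7, threshold=-1.0):
    """Return 1-based inclusive (start, end) residue ranges covered by
    consecutive windows whose mean hydropathy is below the threshold."""
    regions = []
    start = None
    end = None
    for i, value in enumerate(hydropathy_profile(seq, window)):
        if value < threshold:
            if start is None:
                start = i
            end = i
        elif start is not None:
            regions.append((start + 1, end + window))
            start = None
    if start is not None:
        regions.append((start + 1, end + window))
    return regions
```

Applied to a protein sequence in the one-letter code, `hydrophilic_regions` returns candidate stretches that could then be reviewed as possible immunogens.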

NOV35 and NOV36 

Two proteins of the invention, referred to herein as NOV35 and NOV36, include 
Short Chain Dehydrogenase/Reductase-like proteins. 

NOV35 

A disclosed NOV35 (designated CuraGen Acc. No. CG57339-01), which encodes a 
novel Short-Chain Dehydrogenase/Reductase-like protein and includes the 2972 nucleotide 
sequence (SEQ ID NO:107) is shown in Table 35A. An open reading frame for the mature 
protein was identified beginning with an ATG initiation codon at nucleotides 690-692 and 
ending with a TGA stop codon at nucleotides 2970-2972. Putative untranslated regions are 
underlined in Table 35A, and the start and stop codons are in bold letters. 
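
The open reading frame coordinates above can be checked against the stated polypeptide length with simple arithmetic. The sketch below is illustrative only; the function name is hypothetical, and it assumes 1-based inclusive nucleotide positions and a stop codon that encodes no residue.

```python
def orf_protein_length(start_first, stop_last):
    """Number of residues encoded by an open reading frame.

    start_first: 1-based position of the first base of the initiation codon
    stop_last:   1-based position of the last base of the stop codon
    """
    span = stop_last - start_first + 1
    if span < 6 or span % 3 != 0:
        raise ValueError("ORF span must be a multiple of 3 covering start and stop codons")
    # Every codon encodes one residue except the terminal stop codon.
    return span // 3 - 1


# NOV35: ATG at nucleotides 690-692, TGA at nucleotides 2970-2972.
print(orf_protein_length(690, 2972))  # 760
```

The result, 760, agrees with the 760 amino acid residue length given for the NOV35 polypeptide.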



Table 35A. NOV35 Nucleotide Sequence (SEQ ID NO:107) 

TTTTTCTTTTTTTTCGAGACGCAGTCTTGCTCTGTCGCCaGGCTGGGGTGCAGTGGCGCAGT^ 

GCAACCTCCATCTCCCGGGTTCAAGTGACTCTCCTTCCTCAGCCTCCCCACTTCAGTTTCTTTATCTGTCAAT 

TGTGGTTAGTGGGCTGTTAATGAAAATTATTAGGTCaAACATCTACTAAGTATCTOTCACAT^ 

CGTCAATTGGCCCTTTTCCTTCCCACTAGACAACTTGAGAAAGCTTCCTCCTAGCCTATAGCTACTCTTCCGT 

TCCACTTCTTGGTTTCCTGCTCTGATTGCCATGTTTTGTTCTCACAGAQGCT^GGAGAGGCAGGTCCGAG 

CGGGGTGACCCGGTCCAAGGCGGAAAAAGTGCGGCCGCCCACTGTGCCAGTGCCGC^^GGTGGATATCGTGCCT 

GGGCGGCrrCAGTGAGGCCGAGTGGATGGCGCTTACAGCCCTCGAGGAGGGCGftGGACGTCGTAGGGGACATCT 

TGGCCGACTTGCTGGCTCGAGTCATGGACTCTGCTTTCAAAGTCTACCTGACTCa^GCAGGTGGGCCGGGAT 

GGGTCCTTCAGACTCGTCTCCCTCTCCCGCCCCTCCCTGCCGACCTGAGATCCTCTCTCGCCTCCGCAGTGCA 

TTCCATTCACCATCAGCCa.GGCCCGGGAGGCCA TGCTGCAGATCACCGAGTGGCGCTTCCrGGCCC^^ 

GGGAGAATCTGCAGTAGCTGAGGACCCCACATGGGGTGA66ACGAGGAGCCTTCGGCATGCACGACGGACTCC 

TGGGCTCAGGGTTCAGTGCCCGTGCTGCACGCGTCCACCTCGGAGGGCCTGGAGAACTTCCAAGGCG^ 

ACTCCTCAGGAGCCTCTCCGGACTCCTCTGCCATTGCTCCTGCTCTCCCCTTTCaSACATCTCa^C^ 

TGCATTTCCCCAGGACCCTGGGGGCGTGGACCGGATCCCTTTAGGAAGGTCGTGGATGGGTCGAGGCTCCCAG 

GAGCAGATGGAATCTTGGGAGCCTTCTCCGCAGCTGAGAGTCACGTCGGCCCCTCCTCCCACATCAGAGCTGT 

TTCAGGAGGCAGGGCCCGGAGGTCCTGTAGAGGAAGCGGACGGCCAGTCTAGAG6CCTCTCCTCGGCCGGGTC 

CTGGTAGCCAGCCCCCAGGCCTCAACTGGGAGGGGAO^CCCCCTCGGCTTCCATTTGTCGTTGGAAGACCTCT 

ACTGTTGCATGCCTCAACTGGACGCGGCTGGGGATCGGCTGGAACTCAGGTCAGAGGGGGTGCCCTGCATCGC 

CTCGGGCGTGTTGGTGTCCTACCCCTCTGTGGGCGGCGCCACCCGCCCCTCCGCGTCCTGCCAGCAGCAGCGG 

GCCGGGCACTCGGATGTGCGGCTGAGCGCCCACCACCACy^GGATGCGCCGCAAGGCGGCCGTGAAACGCCTt^ 

ACCCTGCGAGGCTCCCGTGCCACTGGGTGCGCCCrCTGGCTGAGGTCCTGGTCCCAGACTCTCAAACACGCCC 

CTTGGAAGCCTACCGCGGACGCCAGCGGGGCGAGAAGACC3^GGCCCGGGCCGAACCCC»AGCCCTCGGCCCC 

GGCACCCGTGTCTCCCCGGCAGCGTTCTTCCCTCTCCGGCCAGGCATTCCTTTCCGTGACTTGGACTCGGGCC 

CCGCACTCCTGTTCCCCACTTTAAATTTAGGCCTATaSTOSCCATCCCrCGAGTCAAAGCTGC^ 

CTCCAGGATCCGCTTCCTCACCACACACCCGGTGCTCCCTGATGTGGCCCGCAGCCGCAGCCCCAAGCTGTGG 

CCCAGTGTCAGGTGGCCCAGCGGTTGGGAGGGGAAGGCCGAGCTGCTGGGCGAGCTGTGG6CTGGCCGGACCC 

GCGTGCCTCCACAGGGTCTGGAGCTGGCAGACS^GGGAGGGCCAGGATCCTGGCAGATGGCCTCGAA 

CCCGGTCCTTGAAGCCACTTCCCAGGTGATGTGGAAGCCCGTGTTGCTGCCAGAAGCCCTGAM 

GGTGTGAGCATGTGGAACCGGAGCACCCAGGTGTTGCrCAGCTCIXSGTGTGCCT^ 

GTAGCACCTTTCCrrCCCGTTGAGCAACATCCCaVTCCAGACAGGTGCCCCAAAGCCCAGCATTTCCCC^ 

CCCAGGAAGTTTCTGCTATGTTGCrGTGGGCTGCACTCAGCMCCTGGTC^ 

TATTCTGGTCTTCTTCAACTACATGTGCAGCTCTGGCAGAAGTCTCATCCCTG^ 

CAGATCTGACTGGGAAAATAGCCATAGTGACTGGGGCCAACaGTGGCATCGGGAAGGTTGTATCCCAGGACCT 

AGCTCGGTGTGGGGCCCAA6TGATCCTTACTTGTCAGAGC7VGGGAATGTGGACAGCAAGCCCTGGCTO 

CAAGCAGCCTCAAACAGCAACCGCCTCCTGCTTGGCGAGGTGGACCTTAGCTCXZATGAC 

TTGCCCGGAGGCTTCTACAGGAGAATCCTGAGATACAT<nX3CTGGTAAACAATGC^^ 

AAGACACTTACCCCAGGGGGCCTGGATCTCACCTTTGTCACTAACTATGTTGGGCCCTTO 

TACTCCAAGGATCTCAAACT^GGTGTACTCCCAGTCCTCrAOTTGAGCTTGGCAG^^ 

CTGGAAAATATTTCAGCAGTTCCTGTGTGATAACTCTTCCCGTTAAAGCCTCTCGQGATCCTCATGTTGCCCA 
GAGCCTCTGGAATGCCTCAGTCCGACTGACAAGCCTAGTCAAGATGGACTG& 



The disclosed NOV35 nucleic acid sequence maps to chromosome 2 and has 108 of 
126 bases (85%) identical to a gb:GENBANK-ID:HUMZB55G05|acc:AF086155.1 mRNA 
from Homo sapiens (Homo sapiens full length insert cDNA clone ZB55G05) (E = 7.4e"^^). 

A disclosed NOV35 polypeptide (SEQ ID NO:108) is 760 amino acid residues in 
length and is presented using the one-letter amino acid code in Table 35B. The SignalP, 
Psort and/or Hydropathy results predict that NOV35 does not have a signal peptide and is 
likely to be localized to the mitochondrial matrix space with a certainty of 0.3600. In 
alternative embodiments, a NOV35 polypeptide is localized to the microbody (peroxisome) 
with a certainty of 0.3051 or the lysosome (lumen) with a certainty of 0.1000. 



Table 35B. Encoded NOV35 Protein Sequence (SEQ ID NO:108) 

MLQITEWRFIARDEGESAVAEDPTWGEDEEPSACTTDSWAQGSVPVIJIASTSEGLENFQGEra 

APALPFPTSHCPSAFPQDPGGVDRIPI/3RSWMGRGSQEQMESWEPSPQLRVTSAPPPTSELFQEAGPGGPVEEA 

IX3QSRGLSSAGSLSASFQLSVEEAPADDADPSLDPYLVASPQASTGRGHPLGFHLSLEDLYCCMPQLDAAGDRL 

ELRSEGVPCIASGVLVSYPSVGGATRPSASCQQQRAGHSDVRLSAHHHRMRRKAAVKRIJJPARLPC^^ 

VLVPDSQTRPLEAYRGRQRGEKTKARAEPQALGPGTRVSPAAFFPIjRPGIPFRDLDSGPALLFPTtiNLGLSSPS 

LESKLPLPNSRIRFLTTHPVLPDVARSRSPKLWPSVRWPSGWEGKAELLGELWAGRTRVPPQGLELADRBGQDP 

GRWPRTTPPVLEATSQVMWKPVLLPEALKLAPGVSMWNRSTQVLLSSGVPEQEDKEGSTFPPTO 

PSISPAGPGSFCYVAVGCTQHPGLGRWLCLPYSGLLQLHVQLWQKSHPWDLQCCSTDLTGKIAIVTGANSGIGK 

WSQDLARCGAQVILTCQSRECGQQAIJ^IQAASNSl^LLIX^EVDLSSMTSIRSFARRIiLQENPEIHIiL^^^ 

VSGFRRHLPQGAWISPLSLTMLGPFCSQIYSKDLKQGVLPVLYLSLAEEPGGISGKYFSSSCVim 

HVAQSIiWNASVRLTSLVKMD 

The NOV35 amino acid sequence was found to have 45 of 98 amino acid residues 
(45%) identical to, and 67 of 98 amino acid residues (68%) similar to, the 318 amino acid 
residue ptnr:SPTREMBL-ACC:Q9NRW0 protein from Homo sapiens (Human) 
(ANDROGEN-REGULATED SHORT-CHAIN DEHYDROGENASE/REDUCTASE 1) (E 
= 1.5e^^ 
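
The whole-number percentages quoted for identities and positives appear to be truncated rather than rounded: 45 of 98 is about 45.9% and is reported as 45%. The following sketch reproduces that convention under this assumption; the function name is hypothetical.

```python
def percent_truncated(matches, aligned_length):
    """Whole-number percentage, truncated toward zero (not rounded)."""
    if aligned_length <= 0:
        raise ValueError("aligned_length must be positive")
    # Integer arithmetic avoids floating-point rounding surprises.
    return (100 * matches) // aligned_length


# NOV35 versus Q9NRW0: 45 of 98 identical, 67 of 98 similar.
print(percent_truncated(45, 98), percent_truncated(67, 98))  # 45 68
```

The same convention is consistent with the other values in this section, for example 454 of 460 reported as 98% in Table 34H.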

NOV35 is expressed in at least the following tissues: B cell germinal, lung, testis, 
prostate, kidney, germ cells, uterus, blood, lymphocyte, thymus, parathyroid, and heart. 
Expression information was derived from the tissue sources of the sequences that were 
included in the derivation of the sequence of NOV35. The sequence is also predicted to be 
expressed in the above tissues because of the expression pattern of (GENBANK-ID: 
gb:GENBANK-ID:HUMZB55G05|acc:AF086155.1), a closely related Homo sapiens full 
length insert cDNA clone ZB55G05 homolog in species Homo sapiens. 

Possible small nucleotide polymorphisms (SNPs) found for NOV35 are listed in 
Table 35C. 



Table 35C: SNPs 


Variant     Nucleotide Position   Base Change   Amino Acid Position   Base Change 

13377032    767                                 NA                    NA 
13377033    2084                  OG            NA                    NA 


NOV35 also has homology to the amino acid sequences shown in the BLASTP data 
listed in Table 35D. 

Table 35D. BLAST results for NOV35 


Gene Index/Identifier                      Protein/Organism                       Length (aa)   Identity (%)    Positives (%)   Expect

gi|12838303|dbj|BAB24157.1| (AK005628)     evidence:NAS~hypothetical              506           242/530 (45%)   295/530 (55%)   2e-88
                                           protein-putative [Mus musculus]

gi|5668735|dbj|BAA82657.1| (AB030504)      UBE-1b [Mus musculus]                  300           48/111 (43%)    72/111 (64%)    2e-18

gi|5668733|dbj|BAA82656.1| (AB030503)      UBE-1a [Mus musculus]                  293           48/111 (43%)    72/111 (64%)    3e-18

gi|12835589|dbj|BAB23296.1| (AK004413)     cell line MC/9.IL4 derived             316           48/111 (43%)    72/111 (64%)    3e-18
                                           transcript 1-data source:MGD,
                                           source key:MGI:102581,
                                           evidence:ISS~putative [Mus musculus]

gi|10947000|ref|NP_067532.1| (NM_021557)   cell line MC/9.IL4 derived             355           48/111 (43%)    72/111 (64%)    6e-18
                                           transcript 1; hypothetical protein
                                           M42C60 [Mus musculus]



The homology of these sequences is shown graphically in the ClustalW analysis 
shown in Table 35E. 



Table 35E. ClustalW Analysis of NOV35 

1) NOV35 (SEQ ID NO:108) 
2) gi|12838303 (SEQ ID NO:421) 
3) gi|5668735 (SEQ ID NO:422) 
4) gi|5668733 (SEQ ID NO:423) 
5) gi|12835589 (SEQ ID NO:424) 
6) gi|10947000 (SEQ ID NO:425) 



[ClustalW alignment of NOV35 and the five BLASTP hits of Table 35D; alignment graphic not reproduced.]



Tables 35F and 35G list the domain description from DOMAIN analysis results 
against NOV35. This indicates that the NOV35 sequence has properties similar to those of 
other proteins known to contain these domains. 







Table 35F Domain Analysis of NOV35 




gnl|Pfam|pfam00106, adh_short, short chain dehydrogenase. This family 
contains a wide variety of dehydrogenases. 

CD-Length = 249 residues, 51.4% aligned 
Score = 84.7 bits (208), Expect = 2e-17 




NOV35: 
Sb j Ct : 


577 
1 


TGKIAIVTGANSGIGKWSQDLARCGAQVILTCQSRECGQQAIiAEIQAASNSNRIiLLGEV 

IMH^IIII + MM +++ II Ihl^^ ^ 1 ^1 Ik+I +1 1 

TGKVALVTGASSGIGIiAIAKRLAEEGAKWWDRREE- KAEAAAELKAE - LGDRALFIQL 


636 
58 


NOV35: 

Sb j Ct : 
118 


637 
59 


DLSSMTSIRSFARRLLQENPEIHLLVNNAGVSGFRR*- - HLPQGAWI SPLSLTMLGPF - CS 

k+ [|++ + + +111111+ 1 1 + 1 + + +11 + 
DVTDEESIKAAVAQAVEELGia^DVIiVNNAGILGPGEPFELSEDDWERVIDVNLTGVFLLT 


693 


NOV35: 


694 


QIYSKDLKQG 703 (SEQ ID NO:426) 




Sbjct r 


119 


1 

QAVLPHMLKR 128 (SEQ ID NO:427) 

Table 35G. Domain Analysis of NOV35 

Scores for sequence family classification (score includes all domains): 

Model            Description                                Score   E-value   N 
adh_short        (InterPro) short chain dehydrogenase       130.8   2.6e-35   1 
SpoU_methylase   (InterPro) SpoU rRNA Methylase family      24.6    2.8e-06   1 
ldh              (InterPro) lactate/malate dehydrogenase    5.3     1.3       1 
P2X_receptor     (InterPro) ATP P2X receptor                4.1     3.8       1 

Parsed for domains: 

Model            Domain   seq-f   seq-t     hmm-f   hmm-t      score   E-value 
ldh              1/1      10      31 ..     1       25         5.3     1.3 
P2X_receptor     1/1      159     181       372     395        4.1     3.8 
adh_short        1/1      7       189 ..    1       206 u      130.8   2.6e-35 
SpoU_methylase   1/1      490     607 ..    1       152 u      24.6    2.8e-06 



NOV36 

A disclosed NOV36 (designated CuraGen Acc. No. CG57341-01), which encodes a 
novel Short-Chain Dehydrogenase/Reductase-like protein and includes the 2077 nucleotide 
sequence (SEQ ID NO:109) is shown in Table 36A. An open reading frame for the mature 
protein was identified beginning with an ATG initiation codon at nucleotides 1-3 and ending 
with a TGA stop codon at nucleotides 1978-1980. Putative untranslated regions are 
underlined in Table 36A, and the start and stop codons are in bold letters. 



Table 36A. NOV36 Nucleotide Sequence (SEQ ID NO:109) 

AT6GAGCGGTGGCGCGACCQGCTGGCGCTGGTGACGGG6GCCTCGGGGGGCATCG6CGCGGCCGTGGCCCGGGC 

CCTGGTCCAGCAGGGACTGAAGGTGGTGGGCTGCGCCCGCACTGTGGGCAACaVTCGAGGAGC 

GTAAGAGTGCAGGCTACCCaSGGACTTTGATCCCCTACaXSATGTGACCT 

ATGTTCTCAGCTATCCGTTCTCAGCACAGCGGTGTAGACATCTGCATCAACAATGCTGGCT^ 

CaCCCTGCTCTCAGGCaGCACCAGTGGTTGGAAGGACSlTGTTCAATGTGAACGTGCTGGCC^ 

CACGGGAAGCCTACCAGTCCATGAAGGAGCGGAATGTGGACGATGGGCACATCATTAACATCAATAGCAT^ 

GGCCACCGAGTGTTACCCCTGTCTGTGACCCACTTCTATAGTGCCACCAAGTATGCCGTCACTGCGCTGACAGA 

GGGACTGAGGCAAGAGOTTOKSGAGGCCCAGACCCACATCCGAGCCACGTGGCAGCTTCGGAGGGAGGAOT 

CTGCCGGATATCAGGCAGCCATCACTGTGAAGCrGGGGTTCTGTGGCCTCCATCCTCTCCCCTCGACCTCCCCA 

AGACCTGGCAAA6CTCAGCCCCTGAGAAGGCCCTCTCTGTTGGCCCAGTGCATCTCTCCAGGTGTGGTGGAGAC 

ACAAOTCGCCTTCAAACTCCACGACAAGGACCCTGAGAAGGCAGCTGCCACCm 

AACCCGAGGATGTGGCCGAGGCTGTTATCTACGTCCrrCAGCACCCCCGCACACATCCAGATTGGAGACa^ 

ATGAGGCCCACGGAGCAGAGAGCTCGGCGGAGACGGCTGTCGAGTACCCTTCACCrCGGTGTTGGGAGCC^ 

AGCGAACTGCGGCGCGGCTTACCGCTCCCGGGGACGCAGCAAGGGGCATCGAGTCCCTGGaS^^ 

TGGCATTGCTOTCGACaSTCCGGGGCGCGACCTGGGGTCGCCTCGTCACCCGTCATTTCTCCC^ 

CATGGGGAGCGGCCTGGTGGGGAGGAGCTAAGCCGCTTGCTGCTGGATGACCT 

GCTTCTGTTTGGCATGACCCCGTGTCTCCTGGCTCTGCAGGCCGCCCGCCGCTCTGTGGCCCGGCTCCTGCT^ 
AGGaSGGTAAAGCTGGGCTGCAGGGGAAGOSGGCCGAGCTGCTCaSGATGGCCGAGGCG 

CTGCGGCCCAGACGGCAGAAACTGGACACAATGTGCCGCTACCAGGTCCACCAGGGTGTCTGCATGGAGGTGAG 

CCCGCTGOSGCCCCGGCCTTGGAGAGAGGCCGGGGAGGCGAGCCCAGGCGACGACCCCCAGCAGT^^ 

TCCTCGATGGGATCCAGGATCCCCGGAATTTTGGGGCTGTGCTGCGTTCCGCaCACTTCCT 

ACCAAAGCCCAGCAGGGCTGGCTCX3TGGCCGGCACGGTGGGCTGCCCAAGCACAGAGGATCCCCAGTCCT 

GATCCCCATCATGAGTTGCTTGGAGTTCCTCTGGGAACGGCCTACTCTCCrTGTGCTGG<^^ 

GTCTATCCCAGGAGGTGCAGGCCTCCTGCCAGCTTCTCCFCACCATCCTGCCCCGGCGCCAGCTGCCT 

CTTGAGTCCTTGAAQSTCTCTGTGGCTGCAGGAaTTOTTCTTCACTCCATTTGC^ 

CACAGAGGGGGAGAGAAGGCAGCTTCn*CCaAGACCCCCAAGAACCCTCAGCCSVGGTCTGAAGGGCTC^ 

CTCAGCACCCAGGGCTGTCTTCAGGCCCAGAGAAAGAGAGGCAAAATGAGGGCTG ACGTGGACTGTCCACAGTG 

TTCATGTGCTGGAGTCAGGGACGGCCGCACCTGCCTCCGCCGGCTCCAGTGTGCGGGGAGCCTCTGCCTG^ 

TGCAC 

The disclosed NOV36 nucleic acid sequence maps to chromosome 17 and has 261 of 
437 bases (59%) identical to a gb:GENBANK-ID:AB035548|acc:AB035548.1 mRNA from 
Streptomyces virginiae (Streptomyces virginiae orf4, orf5 genes for ketoacyl ACP/CoA 
reductase homolog, dNDP-glucose dehydratase homolog, complete and partial cds) (E = 7.0e' 

A disclosed NOV36 polypeptide (SEQ ID NO:110) is 659 amino acid residues in 
length and is presented using the one-letter amino acid code in Table 36B. The SignalP, 
Psort and/or Hydropathy results predict that NOV36 does not have a signal peptide and is 
likely to be localized to the mitochondrial intermembrane space with a certainty of 0.7500. 
In alternative embodiments, a NOV36 polypeptide is located to the nucleus with a certainty 
of 0.6000, the mitochondrial matrix space with a certainty of 0.3600 or the microbody 
(peroxisome) with a certainty of 0.3000. 



Table 36B. Encoded NOV36 Protein Sequence (SEQ ID NO:110) 

l^ERWRDRLALVTGASGGIG^ 
FSAIRSQHSGVDICINNAGLARPDTLLSGSTSGWKDMFlSrVNVIxALSICTREAYQSMKER^^ 
RVLPLSVTHFYSATKYAVTALTEGLRQELREAQTHIRATWQLI^EEAAAGYQAAITVKIiGFCGLHP 
KAQPLRRPSLLAQCISPGVVETQFAFKLHDKDPEKAAATYEQMKCLKPEDVAEAVIYVIaSTPAHIQIGDIQMRPT 
EQRARRRRLSSTLHLGVGSLGANCGAGYRSRGRSKGHRVPGGSCJy^LALLSTVRGATWGRI^VTRHFSHA^^ 
GGEELSRIiLIJ)DLVPTSRLEI^FGMTPCLLAI.QAARRSVARLLLQAGKAGLQGKRAEI>LRMAEARDI^ 
iCL.IXraCRYQVHQGVCMEVSPLRPRPWREAGEASPGDDPQQLWLVLDGIQDPRNFGAVLRSAHFLGVBKT^ 
LVAGTVGCPSTEDPQSSEIPIMSCLEFLWERPTLLVLGNEGSGLSQEVQASCQLIJl^TILPRRQLPPGLESLITVSV 
AAGI LLHS I CSQRKGFPTEGERRQLLQDPQEPS ARSEGLSMAQHPGIiSSGPEKERQNEG 

The NOV36 amino acid sequence was found to have 74 of 192 amino acid residues 
(38%) identical to, and 114 of 192 amino acid residues (59%) similar to, the 251 amino acid 
residue ptnr:SPTREMBL-ACC:Q9XYN2 protein from Drosophila melanogaster (Fruit fly) 
(ANTENNAL-SPECIFIC SHORT-CHAIN DEHYDROGENASE/REDUCTASE) (E = 1.7e" 

NOV36 is predicted to be expressed in at least the following tissues: lung, corresponding 
non-cancerous liver tissue, colon, heart, uterus, skin, brain, and placenta. Expression 
information was derived from the tissue sources of the sequences that were included in the 
derivation of the sequence of NOV36. The sequence is also predicted to be expressed in the 
above tissues because of the expression pattern of (GENBANK-ID: 
gb:GENBANK-ID:AB035548|acc:AB035548.1), a closely related Streptomyces virginiae orf4, orf5 genes for 
ketoacyl ACP/CoA reductase homolog, dNDP-glucose dehydratase homolog, complete and 
partial cds homolog in species Streptomyces virginiae. 

NOV36 also has homology to the amino acid sequences shown in the BLASTP data 
listed in Table 36C. 



Table 36C. BLAST results for NOV36 

Gene Index/Identifier                          Protein/Organism                                         Length (aa)  Identity (%)   Positives (%)  Expect 
gi|13236542|ref|NP_077284.1| (NM_024308)       hypothetical protein MGC4172 [Homo sapiens]              181          179/228 (78%)  179/228 (78%)  4e-95 
gi|10438968|dbj|BAB15390.1| (AK026196)         unnamed protein product [Homo sapiens]                   181          178/228 (78%)  178/228 (78%)  2e-93 
gi|14495621|gb|AAH09416.1|AAH09416 (BC009416)  hypothetical protein FLJ22578 [Homo sapiens]             158          142/145 (97%)  143/145 (97%)  8e-73 
gi|13376296|ref|NP_079140.1| (NM_024864)       hypothetical protein FLJ22578 [Homo sapiens]             155          142/145 (97%)  143/145 (97%)  8e-73 
gi|13542856|gb|AAH05625.1|AAH05625 (BC005625)  Similar to hypothetical protein FLJ22578 [Mus musculus]  124          91/114 (79%)   99/114 (86%)   1e-43 



The homology of these sequences is shown graphically in the ClustalW analysis 
shown in Table 36D. 



Table 36D. ClustalW Analysis of NOV36 

1) NOV36 (SEQ ID NO:110) 
2) gi|14495621 (SEQ ID NO:428) 
3) gi|13376296 (SEQ ID NO:429) 
4) gi|13542856 (SEQ ID NO:430) 
5) gi|13236542 (SEQ ID NO:431) 
6) gi|10438968 (SEQ ID NO:432) 



[Alignment figure: ClustalW multiple sequence alignment of NOV36 (659 aa) with gi|14495621 (158 aa), gi|13376296 (155 aa), gi|13542856 (124 aa), gi|13236542 (181 aa) and gi|10438968 (181 aa).] 


Table 36E lists the domain description from DOMAIN analysis results against 
NOV36. This indicates that the NOV36 sequence has properties similar to those of other 
proteins known to contain these domains. 



Table 36E. Domain Analysis of NOV36 

gnl|Pfam|pfam00106, adh_short, short chain dehydrogenase. This family contains 
a wide variety of dehydrogenases. 

CD-Length = 249 residues, 69.9% aligned 
Score = 144 bits (364), Expect = 1e-35 

NOV36:    6 DRLALVTGASGGIGAAVARALVQQGLKVVGCARTVGNIEELAAECKSAGYPGTLIPYRCD 65 
Sbjct:    2 GKVAIiVTGASSGIGLAIAKRIJySEGAKVVVVDRREEKAEAAAELKAEIXS--DRALFIQLD 59 

NOV36:   66 LSNEEDILSMFSAIRSQHSGVDICINNAGLARPDTLLSGSTSGWKDMFNVNVLALSICTR 125 
Sbjct:   60 VTDEESIKAAVAQAVEEIXSRIJDVLVlSrNAGILGPGEPFEI^EDDWERVIDVlS^ 119 

NOV36:  126 EAYQSMKERNVDDGHIININSMSGHRVLPLSVTHFYSATKYAVTALTEGLRQELREAQTH 185 
Sbjct:  120 AVLPHMLKRS--GGRIVNISSVAGLVPSPGLSA--YSASKAAWGFTRSLALEI,--APHG 173 

NOV36:  186 IR 187 (SEQ ID NO:433) 
Sbjct:  174 XR 175 (SEQ ID NO:434) 

gnl|Pfam|pfam00588, SpoU_methylase, SpoU rRNA Methylase family. This family of 
proteins probably use S-AdoMet. 

CD-Length = 143 residues, 97.9% aligned 
Score = 60.8 bits (146), Expect = 2e-10 

NOV36:  493 LVLDGIQDPRNFGAVLRSAHFLGVD 517 
Sbjct:    4 VVIJ>BVEXPHNIGAIIRTCAALGVDGIVIVDDGFAIjLDRRLRRASLGYAESVPVIRVDm 63 

NOV36:  518 KTKAQQGWLVAGTVGCPSTEDPQSSEIPIMSCLEFLWERPTLLVLGNEGSGLSQE 572 
Sbjct:   64 eeflahlkesgiwij:.t tsgdgnadpld--YEDGAKRLALVFGSETTGLSNL 112 

NOV36:  573 VQASCQLLLTILPRRQLPPGLESLNVSVAAGILLH 607 (SEQ ID NO:435) 
Sbjct:  113 ALEPADQRIRI PMNGDVRSLNVSVAVGIjLLY 143 (SEQ ID NO:436) 

gnl|Pfam|pfam01370, Epimerase, NAD dependent epimerase/dehydratase family. 
This family of proteins utilise NAD as a cofactor. The proteins in this family 
use nucleotide-sugar substrates for a variety of chemical reactions. 

CD-Length = 310 residues, 49.0% aligned 
Score = 37.7 bits (86), Expect = 0.002 

NOV36:   10 LVTGASGGIGAAVARALVQQG-LKVVGCAR--TVGNIEELAAECKSAGYPGTLIPYRCDL 66 
Sbjct:    2 LWGGAGFIGSHLVRELIiNNGDDKVVVLDNIiTYAGNEAR^ TFVKGDI 57 

NOV36:   67 SNEEDILSMFSAIRSQHSGVDICINNAGLARPDTLLSGSTSGWKDMFNVNVLALSICTRE 126 
Sbjct:   58 CDRDLIiDKVF AENQPDAVIHFAAESHVDRSIEKPIjAyiDT--NV-VGTLTLL 106 

NOV36:  127 AYQSMKERNVDDGHIININSMSG-HRVLPLSVTH FYSATKYAV 168 (SEQ ID NO:437) 
Sbjct:  107 --EAARKAGVFKFVFSSTDEVYGDLPSIPITEDTPYGPSSPYGASKASS 153 (SEQ ID NO:438) 



The novel human short chain dehydrogenase/reductase-like proteins of the invention 
contain dehydrogenase/reductase domains. Therefore, it is anticipated that these novel 
proteins have a role in the regulation of essentially all cellular functions and could be 
potentially important targets for drugs. Such drugs may have important therapeutic 
applications. 

The short-chain dehydrogenases/reductases family (SDR) (See Jörnvall et al., 
Biochemistry 34: 6003-6013 (1995); InterPro IPR002198) is a very large family of enzymes, 
most of which are known to be NAD- or NADP-dependent oxidoreductases. As the first 
member of this family to be characterized was Drosophila alcohol dehydrogenase, this family 
used to be called 'insect-type', or 'short-chain' alcohol dehydrogenases. Most members of this 
family are proteins of about 250 to 300 amino acid residues. Most dehydrogenases possess at 
least 2 domains, the first binding the coenzyme, often NAD, and the second binding the 
substrate. This latter domain determines the substrate specificity and contains amino acids 
involved in catalysis. Little sequence similarity has been found in the coenzyme binding 
domain although there is a large degree of structural similarity, and it has therefore been 
suggested that the structure of dehydrogenases has arisen through gene fusion of a common 
ancestral coenzyme nucleotide sequence with various substrate specific domains. This 
indicates that the sequence of the invention has properties similar to those of other proteins 
known to contain this/these domain(s) and similar to the properties of these domains. 

Wang et al. (J Biol Chem 1999 Apr 9;274(15):10309-15) show that a short chain 
dehydrogenase/reductase and a cytochrome P450 are expressed specifically or preferentially 
in the olfactory organs, the antennae. The evolutionarily conserved expression of 
biotransformation enzymes in olfactory organs suggests that they play an important role in 
olfaction. 

The protein similarity information, expression pattern, cellular localization, and map 
location for the NOV35 and NOV36 proteins and nucleic acids disclosed herein suggest that 
this short-chain dehydrogenase/reductase-like protein may have important structural and/or 
physiological functions characteristic of the dehydrogenase/reductase family. Therefore, the 
nucleic acids and proteins of the invention are useful in potential diagnostic and therapeutic 
applications and as a research tool. These include serving as a specific or selective nucleic 
acid or protein diagnostic and/or prognostic marker, wherein the presence or amount of the 
nucleic acid or the protein is to be assessed. These also include potential therapeutic 
applications such as the following: (i) a protein therapeutic, (ii) a small molecule drug target, 
(iii) an antibody target (therapeutic, diagnostic, drug targeting/cytotoxic antibody), (iv) a 
nucleic acid useful in gene therapy (gene delivery/gene ablation), (v) an agent promoting 
tissue regeneration in vitro and in vivo, and (vi) a biological defense weapon. 

The NOV35 and NOV36 nucleic acids and proteins of the invention have applications 
in the diagnosis and/or treatment of various diseases and disorders. For example, the 
compositions of the present invention will have efficacy for the treatment of patients 
suffering from: systemic lupus erythematosus, autoimmune disease, asthma, emphysema, 
scleroderma, allergy, ARDS, Von Hippel-Lindau (VHL) syndrome, cirrhosis, 
transplantation, cardiomyopathy, atherosclerosis, hypertension, congenital heart defects, 
aortic stenosis, atrial septal defect (ASD), atrioventricular (A-V) canal defect, ductus 
arteriosus, pulmonary stenosis, subaortic stenosis, ventricular septal defect (VSD), valve 
diseases, tuberous sclerosis, scleroderma, obesity, transplantation, Alzheimer's disease, 
stroke, tuberous sclerosis, hypercalcemia, Parkinson's disease, Huntington's disease, cerebral 
palsy, epilepsy, Lesch-Nyhan syndrome, multiple sclerosis, ataxia-telangiectasia, 
leukodystrophies, behavioral disorders, addiction, anxiety, pain, neurodegeneration, psoriasis, 
actinic keratosis, tuberous sclerosis, acne, hair growth/loss, alopecia, pigmentation disorders, 
endocrine disorders, endometriosis, fertility as well as other diseases, disorders and 
conditions. 

These antibodies may be generated according to methods known in the art, using 
prediction from hydrophobicity charts, as described in the "Anti-NOVX Antibodies" section 
below. The disclosed NOV35 protein has multiple hydrophilic regions, each of which can be 
used as an immunogen. In one embodiment, a contemplated NOV35 epitope is from about 
amino acids 2 to 20. In another embodiment, a contemplated NOV35 epitope is from about 
amino acids 60 to 65. In other specific embodiments, contemplated NOV35 epitopes are 
from about amino acids 105 to 130, 160 to 167, 190 to 220, 221 to 225, 270 to 290, 310 to 
320, 390 to 410, 425 to 460, 490 to 515, 570 to 580, 610 to 620, 670 to 690, 760 to 770. The 
disclosed NOV36 protein has multiple hydrophilic regions, each of which can be used as an 
immunogen. In one embodiment, a contemplated NOV36 epitope is from about amino acids 
50 to 70. In another embodiment, a contemplated NOV36 epitope is from about amino acids 
100 to 105. In other specific embodiments, contemplated NOV36 epitopes are from about 
amino acids 110 to 120, 190 to 200, 220 to 225, 270 to 275, 290 to 305, 320 to 325, 380 to 
385, 430 to 460, 490 to 505 and 610 to 660. 
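
The text selects candidate epitopes from hydrophilic stretches read off a hydrophobicity chart but does not say which scale or window was used. As a rough sketch only, the following assumes the Kyte-Doolittle hydropathy scale, a 7-residue sliding window and a cutoff of -1.0; the function names, the window size and the cutoff are illustrative assumptions and are not taken from the disclosure.

```python
# Kyte-Doolittle hydropathy values (assumed scale; the text names none).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=7):
    """Mean hydropathy of every full window along the sequence."""
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and len(seq)")
    scores = [KD[residue] for residue in seq]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    ]


def hydrophilic_windows(seq, window=7, threshold=-1.0):
    """1-based (start, end) of windows whose mean is at or below threshold."""
    profile = hydropathy_profile(seq, window)
    return [
        (i + 1, i + window)
        for i, mean in enumerate(profile)
        if mean <= threshold
    ]


if __name__ == "__main__":
    # A short charged stretch scores as hydrophilic; a poly-Ile stretch does not.
    print(hydrophilic_windows("DKDPEKR"))
    print(hydrophilic_windows("IIIIIII"))
```

Overlapping windows that pass the cutoff would normally be merged into longer ranges before being compared with ranges such as those listed above.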

NOV37 

A disclosed NOV37 (designated CuraGen Acc. No. CG57335-01), which encodes a 
novel Protocadherin beta 3-like protein and includes the 3010 nucleotide sequence (SEQ ID 
NO:111) is shown in Table 37A. An open reading frame for the mature protein was 
identified beginning with an ATG initiation codon at nucleotides 429-431 and ending with a 
TAA stop codon at nucleotides 2817-2819. Putative untranslated regions are underlined in 
Table 37A, and the start and stop codons are in bold letters. 
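
The coordinates above fix the length of the encoded protein. The short sketch below works through that arithmetic using only the numbers quoted in the text (3010 nucleotides, ATG at 429-431, TAA at 2817-2819); the function and variable names are illustrative.

```python
def orf_summary(start_first_base, stop_last_base, total_length):
    """Derive ORF and UTR sizes from 1-based, inclusive coordinates."""
    span = stop_last_base - start_first_base + 1  # includes the stop codon
    if span % 3 != 0:
        raise ValueError("ORF span is not a whole number of codons")
    codons = span // 3
    return {
        "orf_nt": span,
        "protein_aa": codons - 1,  # the stop codon encodes no residue
        "utr5_nt": start_first_base - 1,
        "utr3_nt": total_length - stop_last_base,
    }


if __name__ == "__main__":
    summary = orf_summary(429, 2819, 3010)
    print(summary)
```

The span of 2391 nucleotides corresponds to 797 codons, that is, 796 residues plus the stop codon, which agrees with the 796-residue polypeptide reported for NOV37.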



Table 37A. NOV37 Nucleotide Sequence (SEQ ID NO:111) 

AATCTTTTTTTTTTTTTTTTTTTTCGTAGATAAAAGTGCATTTTATTTCCCTAGATTGCATTTATTTAATTCa.TATAA^ 

CATGAGAAACTCCTCCAGTAGCGTCAACTAGGGTTGATAAGAATAATCGATAAAGCAAAATAAAAACACCTTCTCCAA 

GATTTTGTAACTGCAAGCGAACGCATGGTGGCGCTGTTGACTAAGAAGGCGAATTAAACCACAGGCATTGTGCATGCT 

CGGTGACGCACGGATCCAGTGTGGTAAACCAGCGGTTGAGAGCCCAGGCAGATTTTTGAGCCAGCAAGTCTGAGCCTC 

TGGAAAGGCTTATTCACTAGGCCGTCTACAAAGGTTGTGGGGCAAAAGACTGTTTCCCAGCTCTGTCTGAGGTTCAGC 

TTGGCGACATTCCCTGGAAGAGCGTGACGGAAAGTGCAA TGGAGGCGGGAGGAGAGCGATTTCTTAGACAAAGGCAAG 

TCTTGCTTCTCTTTGTTTTTCTGGGAGGGTCTCTGGCTGGGTCCGAGTCAAGACGCTATTCTGTG6CTGAGGAAAAAG 

AGAAGGGCTTTTTAATAGCCAACCTAGCAAAGGATCTGGGACTAAGGGTAGAGGAACTGGCCGCGAGGGGGGCCCAAG 

TTGTGTCCAAAGGGAACAAACAGCATTTTCAGCTCAGTCATCAGACAGGTGATTTGCTCCTGAATGAGAAATTGGACC 

GGGAGGAGCTATGCGGCCCCACAGAACCATGCATACTACATTTTCAGATATTACTGCAAAACCCTTTGCAATTCGTT^ 

CAAACGAGCTCCGTATCATAGATGTAAATGACCATTCTCCGGTATTCTTTGAAAATGAAATGCATCTGAAAATCCTAG 

AAAGCACTCTGCCAGGAACAGTAATTCCrTTGGGAAATGCTGAGGACTTGGATGTGGGAAGAAACa.GCCT 

ACACTATCACTCCGAATTCCCACTTCCACGTACCCACTCGCAGTCGTAGGGACGGAAGGAAGTACCCGGAACTAGTAC 

TGAACAGAGCCCTGGATCGCGAGGAGCAGCCTGAGATCAGGTTAACCCTCACAGCGCTAGATGGCGGGAGTCCACCCA 

GGTCCGGCACGGCCCTGGTACGGATTGAAGTTGTGGACATCAATGACAACGTCCCAGAGTTTGCAAAGCTGCTCTATG 

AGGTGCAGATCCCGGAGGACAGCCCCGTTGGATCCCAGGTTGCCATCGTCTCTGCCAGGGATTTAGACATTGGAACTA 

ATGGAGAAATATCTTATGCATTTTCCCAAGCATCTGAAGACATTCGCAAAAC6TTTCGATTAAGTGCAAAATCGGGAG 

AACTGCTTTTAAGACAGAAACTGGATTTCGAATCCATCCAGACATACACAGTAAATATTCAGGCGACAGATG^ 

GCCTATCCGGAAAGTCTACAGTCATAGTCCAGGTGGTTGATGTCyUWCGACAACCCACCGGAACTGACCTTGTCTTC^ 

TAAACAGCCCTATTCCTGAGAACTCGGGAGAGACTGTACTGGCTGTTTTCAGTGTTTCTGATCTAGACTCTGGAGAa^ 

ACGGAAGAGTGATGTGTTCCATTGAGAACAATCTCCCCTTCTTCCTGAAACCATCTGTAGAGAATTTTTACACCCTAG 

TGTCAGAAGGCGCGCTGGACAGAGAGACCAGATCCGAGTACAACATTACCATCACTATCACTGACCTGGGGACACCCA 

GGCTGAAAACCAAGTACAACATAACCGTGCTGGTCTCCGACGTCAATGACAACGCCCCCGCCTTCaVCCCaU^ 

ACACCCTGTTCGTCCGCGAGAACAACAGCCCCGCCCrGCACATCGGCaGTGTC^GCGCCACA6Aa^(^ 

CCAACGCCCAGGTAACCTACTCGCTGCTGCCGCCCCAGGACCCGCACCTGCCCCTCTCTTCCCTGGTCTCCATC^ 

CGGACAACGGCCACCTGTTTGCCCTCAGGTCGCTGGACTACGAGGCCCTGCAGGCGTTCGAGTTCCG^ 

CAGACCGTGGCTCCCCGGCTTTGAGCAGCGAGGCGCTGGTGCGCGTGCTGGTGCTGGACGCCAACGACAACTCGCCCT 

TCGTGCTGTACCCGCTGCAGAACGGCTCCGCGCCCTGCACCGAGCTGGTGCCCCGGGCGGCTGAGCCGGGCTACCTGG 

TGACCAAGGTGGTGGCGGTGGACGGCGACTCXSGGCCAGAACGCCTGGCTGTCGTACCAGCTGCTCaAG^ 

CCGGGCTGTTCGGCGTGTGGGCGCACAATGGCGAAGTGCGC7VCCGCCAGGCTGCTGAGGGAGCGCGACGCTGCCAA 

AGAGGCTGGTGGTGCTGGTCAAGGACAATGGCGAGCCTCCGCGCTCGGCCACCGCCACGCTGCAC6TGCTCCTGG 

ACGGCTTCTCCCAGCCCTACCTGCTGCTCCCGGAGGCGGCACCGGCCCAGGCCCAGGCCGACrTGCTCACCGTCTACC 

TGGTGGTGGCGTTGGCCTCGGTGTCTTCGCTCTTCCTCTTCTCGGTGCTCCTGTTCGTGGCGGTGCGGCTGTGCAGGA 

GGAGCAGGGCGGCCTCGGTGGGTCGCTGCTCGGTGCCCGAGGGCCCCTTTCCAGGGCAGATGGTGGACGTGAGCGGCA 

CCGGGACCCTGTCCCAGAGCTACCAGTACGAGGTGTGTCTGACTGGAGGCTCCGGGACAAATGAGTTCAAGTTCCTGA 

AGCCAATTATCCCCAACTTCGTTGCrCAGGGTGCAGAGAGGGTTAGCGAGGCAAATCCCAGTTTCAGGAAGM 



CCATTGGAGGTGTCTCCTTTTATTAGAAAGTAACCATCTTATTCCAATTCTATGCATGTTACTGGTATTTATAAATG 
ATGAGTTTTTTTGCGGTATAATAAATGTAAATTTTCTTTGTATTCT 
The disclosed NOV37 nucleic acid sequence maps to chromosome 5 and has 2257 of 
2391 bases (94%) identical to a gb:GENBANK-ID:AF152496|acc:AF152496.1 mRNA from 
Homo sapiens (Homo sapiens protocadherin beta 3 (PCDH-beta3) mRNA, complete cds) 
(E = 0.0). 

A disclosed NOV37 polypeptide (SEQ ID NO:112) is 796 amino acid residues in 
length and is presented using the one-letter amino acid code in Table 37B. The SignalP, 
Psort and/or Hydropathy results predict that NOV37 has a signal peptide and is likely to be 
localized to the plasma membrane with a certainty of 0.4600. In alternative embodiments, a 
NOV37 polypeptide is located to the endoplasmic reticulum (membrane) with a certainty of 
0.1000, the endoplasmic reticulum (lumen) with a certainty of 0.1000 or the outside of the 
cell with a certainty of 0.1000. The SignalP predicts a likely cleavage site for a NOV37 
peptide between amino acid positions 26 and 27, i.e., at the sequence SLA-GS. 
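
To make the stated cleavage position concrete, the sketch below splits the N-terminus of NOV37 at residue 26 and prints the residues flanking the cut. It uses only the first 32 residues as they read in Table 37B; the helper name is illustrative, and the code only checks the quoted coordinates rather than predicting anything.

```python
# First 32 residues of NOV37 as read from Table 37B (one-letter code).
N_TERMINUS = "MEAGGERFLRQRQVLLLFVFLGGSLAGSESRR"


def cleave(seq, last_signal_residue):
    """Split a sequence after the given 1-based residue number."""
    return seq[:last_signal_residue], seq[last_signal_residue:]


if __name__ == "__main__":
    signal, mature = cleave(N_TERMINUS, 26)
    # Three residues before the cut and two after, as written in the text.
    junction = signal[-3:] + "-" + mature[:2]
    print(len(signal), junction)
```

Residues 24-26 are SLA and residues 27-28 are GS, matching the SLA-GS junction given in the text.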



Table 37B. Encoded NOV37 Protein Sequence (SEQ ID NO:112) 

MEAGGERFLRQRQVLLLFVFLGGSLAGSESRRYSVAEEKEKGFLIANIAKDLGLRVEEIJy^ 

SHQTGDIiLLNEKLJ5REELCGPTEPCILHFQIIiLQNPLQFVTNELRIIDVNDHSPVFFENEN^ 

GNAEDLDVGRNSLQNYTITPNSHFHVPTRSRRDGRKYPELVIiNRALDREEQPEIRLTLTAIiDGG^ 

WDINDNVPEFAKLLYEVQXPEDSPVGSQVAIVSARDLDIGTNGEISYAFSQASEDIRKTFRLSAKSGELLLRQKLD 

FES IQTYTVNI QATDGGGLSGKSTVI VQWDVNDNPPELTLSS VNS PI PENSGETVI*AVFSVSDLDSGDNGRWICSI 

ENNLPFFLKPSVENFYTLVSEGALDRETOSEYNITITITDLGTPRLKTKYNITVLVSDVlSroN^ 

NNSPALHIGSVSATDRDSGTNAQVTYSLLPPQDPHLPLSSLVSINADNGHLFALRSLDYEALQAFEFRVGATDRGSP 
ALSSEALVRVLVLDANDNSPFVLYPLQNGSAPCTEIiVPRAAEPGYLVTKVVAVDGDSGQNAWLSYQLLKATEPGLF^ 
VWAHNGEVRTARLLRERDAAKQRLVVLVKDNGEPPRSATATIJIVLLVTC 

IJ^SVSSLFLFSVIiLFVAVRLCRRSRAASVGRCSVPEGPFPGQMVDVSGTGTLSQSYQYEVCLTGGSGTNEFKFLK 

I PNFVAQGAERVSEANPSFRKSFEFT 

The NOV37 amino acid sequence was found to have 742 of 796 amino acid residues 
(93%) identical to, and 767 of 796 amino acid residues (96%) similar to, the 796 amino acid 
residue ptnr:SPTREMBL-ACC:Q9Y5E6 protein from Homo sapiens (Human) 
(PROTOCADHERIN BETA 3) (E = 0.0). 
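
The identity and similarity percentages quoted for these comparisons can be reproduced from the residue counts. The sketch below assumes the percentage is truncated to a whole number rather than rounded; that assumption is not stated in the text, but it reproduces the figures quoted for NOV36 and NOV37 (for example, 261 of 437 is given as 59%, not 60%).

```python
def percent(matches, aligned):
    """Whole-number percentage, truncated (assumed reporting convention)."""
    if aligned <= 0 or not 0 <= matches <= aligned:
        raise ValueError("expected 0 <= matches <= aligned and aligned > 0")
    return (100 * matches) // aligned


if __name__ == "__main__":
    # (matches, aligned length) pairs quoted in the text.
    for matches, aligned in [(742, 796), (767, 796), (74, 192), (114, 192), (261, 437)]:
        print(f"{matches}/{aligned} ({percent(matches, aligned)}%)")
```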

NOV37 is predicted to be expressed in at least the following tissues: brain, spinal cord, 
cartilage, heart, stomach, testis, urinary bladder, and oviduct/uterine tube/fallopian tube. 
Expression information was derived from the tissue sources of the sequences that were 
included in the derivation of the sequence of NOV37. 

Possible single nucleotide polymorphisms (SNPs) found for NOV37 are listed in 
Table 37C. 
Table 37C: SNPs 

Variant    Nucleotide Position  Base Change  Amino Acid Position  Amino Acid Change 
13377034   302                  G>C          NA                   NA 
13377035   457                  G>T          10                   Arg>Ile 
13377036   2537                 OT           NA                   NA 
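
The relation between a nucleotide position and the affected amino acid follows from the open reading frame coordinates given for NOV37 (ATG at 429-431, stop codon ending at 2819). The sketch below maps a position to its codon and applies the G>T change at position 457. The reading of codon 10 as AGA is taken from the Table 37A sequence as it reads here and should be treated as an assumption; the function names are illustrative, and only the two codons needed are included in the lookup.

```python
ORF_START = 429      # first base of the ATG initiation codon (from the text)
ORF_STOP_END = 2819  # last base of the TAA stop codon (from the text)


def codon_of(nt_pos):
    """Return (codon number, position within codon), or None outside the ORF."""
    if nt_pos < ORF_START or nt_pos > ORF_STOP_END:
        return None
    offset = nt_pos - ORF_START
    return offset // 3 + 1, offset % 3 + 1


# Standard genetic code entries for the two codons used in this example only.
CODON_TO_RESIDUE = {"AGA": "Arg", "ATA": "Ile"}


def substitute(codon, pos_in_codon, new_base):
    """Replace the base at a 1-based position within a codon."""
    i = pos_in_codon - 1
    return codon[:i] + new_base + codon[i + 1:]


if __name__ == "__main__":
    codon_number, pos = codon_of(457)
    mutant = substitute("AGA", pos, "T")  # AGA assumed for codon 10
    print(codon_number, pos, CODON_TO_RESIDUE["AGA"], CODON_TO_RESIDUE[mutant])
    print(codon_of(302))   # upstream of the ATG, so no amino acid position
    print(codon_of(2537))  # third base of its codon
```

Position 457 falls on the second base of codon 10, and AGA to ATA gives Arg to Ile, in agreement with the table. Position 302 lies in the 5' untranslated region, and position 2537 is a third-base position, which is consistent with both being listed as NA.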



NOV37 also has homology to the amino acid sequences shown in the BLASTP data 
listed in Table 37D. 



Table 37D. BLAST results for NOV37 

Gene Index/Identifier                     Protein/Organism                                                                                        Length (aa)  Identity (%)   Positives (%)  Expect 
gi|9256614|ref|NP_061760.1| (NM_018937)   protocadherin beta 3 precursor [Homo sapiens]                                                           796          742/796 (93%)  767/796 (96%)  0.0 
gi|9256612|ref|NP_061759.1| (NM_018936)   protocadherin beta 2 precursor [Homo sapiens]                                                           798          693/798 (86%)  730/798 (90%)  0.0 
gi|13431369|sp|Q9NRJ7|CDBG_HUMAN          Protocadherin beta 16 precursor (PCDH-beta16) (Protocadherin 3X)                                        776          605/774 (78%)  682/774 (87%)  0.0 
gi|14195605|ref|NP_066008.1| (NM_020957)  protocadherin beta 16 precursor; protocadherin beta 8a; protocadherin-3x; cadherin ME1 [Homo sapiens]  776          603/774 (77%)  681/774 (87%)  0.0 
gi|10047319|dbj|BAB13447.1| (AB046841)    KIAA1621 protein [Homo sapiens]                                                                         787          599/774 (77%)  677/774 (87%)  0.0 



The homology of these sequences is shown graphically in the ClustalW analysis 
shown in Table 37E. 



Table 37E. ClustalW Analysis of NOV37 

1) NOV37 (SEQ ID NO:112) 
2) gi|9256614 (SEQ ID NO:439) 
3) gi|9256612 (SEQ ID NO:440) 
4) gi|13431369 (SEQ ID NO:441) 
5) gi|14195605 (SEQ ID NO:442) 
6) gi|10047319 (SEQ ID NO:443) 



[Alignment figure: ClustalW multiple sequence alignment of NOV37 (796 aa) with gi|9256614 (796 aa), gi|9256612 (798 aa), gi|13431369 (776 aa), gi|14195605 (776 aa) and gi|10047319 (787 aa).] 



Table 37F lists the domain description from DOMAIN analysis results against 
NOV37. Many regions of NOV37 show homology to the cadherin repeats and cadherin 
domain. This indicates that the NOV37 sequence has properties similar to those of other 
proteins known to contain these domains. 
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Table 37F: Domain Analysis of NOV37 

gnl 1 Smart 1 smart 00 112, CA, Cadherin repeats.; Cadherins are glycoproteins 
involved in Ca2+-mediated cell-cell adhesion. Cadherin domains occur as 
repeats in the extracellular regions which are thought to mediate cell -cell 
contact when bound to calcium. 

CD-Length = 82 residues, 100.0% aligned 

Score = 85.9 bits (211), Expect = 8e-18 


NOV37: 
Sbjct : 


473 
1 


VSATDRDSGTNAQVTySLLPPQDPHLPLSSLVSINADNGHLFALRSLDYEALQAFEFRVG 

Mill III 1 -Ulhl II Ih + 1 ^ ^111 * 1 

VSATDADSGENGKVTYSILSGNDGGL FSIDPETGIITTTKPLDREEQSEYTLTVE 


532 
55 


NOV37: 


533 


ATDRGSPALSSEALVRVLVLDANDNSP 559 (SEQ ID NO: 444) 
111 M nil II III ilhl 

ATDGGGPPLSSTATVTVTVLDVNDNAP 82 (SEQ ID NO:445) 




Sbjct: 


56 




gnl 1 Smart |smart001l2, CA, Cadherin repeats. 

CD-Length = 82 residues, 100.0% aligned 
Score = 81.6 bits (200), Expect = 2e-16 




N0V37: 
Sb j Ct : 


264 
1 


VSARDLDIGTMGEISYAFSQASEDIRKTFRLSAKSGELLLRQKLDFESIQTYTVNIQATD 

III M 1 ll+++k + 1 + ++I + +111 11+ ++III 

VSATDADSGENGKVTYSILS - - GNDGGLFSIDPETGIITTTKPLDREEQSEYTLTVEATD 


323 
58 


NOV37: 


324 


GGG- -LSGKSTVIVQWDVNDNPP 345 (SEQ ID NO: 44 6) 

Hi II +11 1 hlllli 1 

GGGPPLSSTATVTVTVLDVNDNAP 82 (SEQ ID NO:445) 




Sb j ct : 


59 




gnl 1 Smart [smart 00 112, CA, Cadherin repeats. 

CD-Length - 82 residues, 98.8% aligned 
Score = 79.0 bits (193), Expect = le-15 




NOV37: 
Sbjct: 


156 
2 


NAEDLDVGRNSLQNYTITPNSHFHVPTRSRRIXSRKYPELVIiNRALDREEQPBIRLTLTAL 

+ 11 M 1 hi + + 1 + + IIIIN 1 11+ I 
SATDADSGENGKVTYS ILSGNDGGLFSIDPETG 1 ITTTKPLDREEQSEYTLTVEAT 


215 
57 


NOV37: 


216 


DGGSPPRSGTALVRIEWDINDNVP 240 (SEQ ID NO: 447) 




Sbj Ct: 


58 


III 11 1 111 + l+l+lll 1 

DGGGPPLSSTATVTVTVLDVNDNAP 82 (SEQ ID NO: 44 8) 




gnl t Smart 1 smart 00112 , CA, Cadherin repeats 

CD-Length = 82 residues, 98.8% aligned 
Score = 68.9 bits (167), Expect - le-12 




NOV37: 
Sbj ct : 


369 
2 


SVSDLDSGDNGRVMCSIENNLPFPLKPSVENFYTLVSEGALDRETRSEYNITITITDLGT 

1 +1 iii+ii+i II + 1 + + nil +111 +1+ n 1 

SATDADSGE3SJGKVTYSILSGNDGGLFSIDPETGIITTTKPLDREEQSEYTLTVEATDGGG 


428 
61 


NOV37: 


429 


PRLKTKYNITVLVSDVNDNAP 449 (SEQ ID NO:449) 




Sbjct: 


62 


11+ HII iiiiiii 

PPLSSTATVTVTVLDVNDNAP 82 (SEQ ID NO: 448) 




gnl [Smart [smart 00112, CA, Cadherin repeats. 

CD-Length - 82 residues, 92.7% aligned 
Score - 65.1 bits (157), Expect - le-11 




NOV37t 
Sbjct: 


589 
1 


WAVDGDSGQNAWLSYQLLKATEPGLFGVWAHNGEVRTARLLRERDAAKQRLWLVKDNG 

1 1 1 lll+l ++I +1 + III + 1 + 1 + 1 + ++ 1 1 II 
VSATDADSGENGKVTYSILSGNDGGLFSIDPETGIITTTKPLDREEQSEYTLTVEATDGG 


648 
60 


NOV37: 


649 


EPPRSATATLHVLLVD 664 (SEQ ID N0:450) 
II 1+111+ 1 ++I 
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Sbjct: 


61 


GPPLSSTATVTVTVLD 76 (SBQ ID NO: 451) 




gnl |Pfam|pfam00028, cadherin, Cadherin domain. 

CD-Length = 92 residues, 98.9% aligned 
Score = 79.3 bits (194), Expect ^ 8e-16 




NOV37: 
Sbjct: 


247 
1 


YEVQIPEDSPVGSQVAIVSAKDLDIGTNGEISYAFSQASEDIRKTFRLSAKSGELLLRQK 

1 [^1 1 hi II 1 k 11+ +1 + 1 + 

YSAS VPENAPVGTEVLTVTATDADLGPNGRIFYS ILGGGPG — GWFRIDPDTGDLSTTKP 


306 
58 


NOV37: 
Sb j Ct : 


3 07 
59 


LDFESIQTYTVNIQATDGG--GLSGKSTVIVQV 337 (SEQ ID NO: 452) 
II III 1 + + III 1 III ^11 + 1 

LDRESIGEYELTVLATDSGGPPLSGTTTVTITV 91 (SEQ ID NO r 45 3) 




gnl|Pfam|pfam00028, cadherin, Cadherin domain. 

CD-Length = 92 residues, 100.0% aligned 
Score = 76.3 bits (186), Expect = 6e-15 




NOV37: 
Sbj Ct: 


456 
1 


YTLFVRENNSPALHIGSVSATDRDSGTNAQVTYSLLPPQDPHLPLSSLVSINADNGHLFA 

h Ml + +I + III 1 1 1 tl + l 1+ t M 

YSASVPENAPVGTEVLTVTATDADLGPNGRIFYSILGGGPGGW FRIDPDTGDLST 


515 
!>!> 


NOV37: 
Sbjct: 


516 
56 


LRSLDYEALQAFEFRVGATDRGSPALSSEALVRVLVL 552 (SEQ ID NO: 4 54) 

+ II ^1 Mil 1 1 II 1 + II 

TKPLDRESIGEYELTVLATDSGGPPLSGTTTVTITVL 92 (SEQ ID NO: 455) 




gnl|Pfam|pfam00028, cadherin, Cadherin domain.

CD-Length = 92 residues, 92.4% aligned
Score = 60.1 bits (144), Expect = 5e-10

NOV37: 576 VPRAAEPGYLVTKVVAVDGDSGQNAWLSYQLLKATEPGLFGVWAHNGEVRTARLLRERDA 635
Sbjct: 5   VPENAPVGTEVLTVTATDADLGPNGRIFYSILGGGPGGWFRIDPDTGDLSTTKPLDRESI 64

NOV37: 636 AKQRLVVLVKDNGEPPRSATATLHV 660 (SEQ ID NO: 456)
Sbjct: 65  GEYELTVLATDSGGPPLSGTTTVTI 89 (SEQ ID NO: 457)




gnl|Pfam|pfam00028, cadherin, Cadherin domain.

CD-Length = 92 residues, 94.6% aligned
Score = 58.5 bits (140), Expect = 1e-09

NOV37: 142 ILESTLPGTVIPLC^AEDLDVGRNSLQNYTITPNSHFHVPTRSRRDGRKYPELVLNRALD 201
Sbjct: 5   VPENAPVGTEVLTVTATDADLGPNGRIFYSILGGGPGGWFRIDPDTG----DLSTTKPLD 60

NOV37: 202 REEQPEIRLTLTALDGGSPPRSGTALVRIEV 232 (SEQ ID NO: 458)
Sbjct: 61  RESIGEYELTVLATDSGGPPLSGTTTVTITV 91 (SEQ ID NO: 459)




gnl|Pfam|pfam00028, cadherin, Cadherin domain.

CD-Length = 92 residues, 94.6% aligned
Score = 48.5 bits (114), Expect = 1e-06

NOV37: 356 IPENSGE-TVLAVFSVSDLDSGDNGRVMCSIENNLPFFLKPSVENFYTLVSE G 407
Sbjct: 5   VPENAPVGTEVLTVTATDADLGPNGRIFYSI LGGGPGGWFRIDPDTGDLSTTK 57

NOV37: 408 ALDRETRSEYNITITITDLGTPRLKTKYNITVLV 441 (SEQ ID NO: 460)
Sbjct: 58  PLDRESIGEYELTVLATDSGGPPLSGTTTVTITV 91 (SEQ ID NO: 459)

Cadherins play important roles in specific cell-cell adhesion events and also mediate
interactions between cells and the extracellular matrix. They are required for morphogenesis, 
mediation of neural cell-cell interactions, and the regulation of immune cell responses (Nollet 
et al., (2000) J. Mol. Biol. 299: 551-72; Henricks and Nijkamp (1998) Eur. J. Pharmacol.
344: 1-13). Many diseases have been shown to be associated with dysfunction of or with 
overexpression of adhesion molecules. For example, improper cadherin levels have been 
observed in human cancer malignancies and are thought to lead to cancer cell invasion and 
metastasis (Nollet et al., (2000) J. Mol. Biol. 299: 551-72). It has also been demonstrated that
anti-adhesion treatment can lead to diminished infiltration and activation of inflammatory 
immune cells resulting in decreased tissue injury and malfunction (Henricks and Nijkamp 
(1998) Eur. J. Pharmacol. 344: 1-13). 

The cadherins form a superfamily of calcium-dependent cell-cell adhesion molecules 
that can be divided into at least six subfamilies, one of which is known as the protocadherin 
subfamily (Nollet et al., (2000) J. Mol. Biol. 299: 551-72). Wu and Maniatis identified 52 
novel cadherin-like genes, including protocadherin beta 3, on human chromosome 5q31 (Wu 
and Maniatis (1999) Cell 97: 779-790). The gene described in this invention is a homolog of 
protocadherin beta 3 and is expressed in the brain, suggesting that it may be involved in neural
cell interactions and play a role in diseases of the central nervous system. Furthermore, based
on observations from the other cadherin family members, the protocadherin beta 3-like gene
may also be involved in cancer or immunological disorders, among other diseases. The 
protocadherin beta 3-like gene maps to human chromosome 5. 

The protein similarity information, expression pattern, cellular localization, and map 
location for the NOV37 protein and nucleic acid disclosed herein suggest that this 
Protocadherin beta 3-like protein may have important structural and/or physiological 
functions characteristic of the Cadherin family. Therefore, the nucleic acids and proteins of 
the invention are useful in potential diagnostic and therapeutic applications and as a research 
tool. These include serving as a specific or selective nucleic acid or protein diagnostic and/or 
prognostic marker, wherein the presence or amount of the nucleic acid or the protein is to be
assessed. These also include potential therapeutic applications such as the following: (i) a 
protein therapeutic, (ii) a small molecule drug target, (iii) an antibody target (therapeutic, 
diagnostic, drug targeting/cytotoxic antibody), (iv) a nucleic acid useful in gene therapy (gene 
delivery/gene ablation), (v) an agent promoting tissue regeneration in vitro and in vivo, and 
(vi) a biological defense weapon.

The NOV37 nucleic acids and proteins of the invention have applications in the
diagnosis and/or treatment of various diseases and disorders. For example, the compositions
of the present invention will have efficacy for the treatment of patients suffering from:
cancer, trauma, bacterial and viral infections, in vitro and in vivo regeneration, Von Hippel-
Lindau (VHL) syndrome, Alzheimer's disease, stroke, tuberous sclerosis, hypercalcemia,
Parkinson's disease, Huntington's disease, cerebral palsy, epilepsy, Lesch-Nyhan syndrome, 
multiple sclerosis, ataxia-telangiectasia, leukodystrophies, behavioral disorders, addiction, 
anxiety, pain, neurodegeneration, arthritis, tendonitis, cardiomyopathy, atherosclerosis, 
hypertension, congenital heart defects, aortic stenosis, atrial septal defect (ASD), 
atrioventricular (A-V) canal defect, ductus arteriosus, pulmonary stenosis, subaortic stenosis, 
ventricular septal defect (VSD), valve diseases, tuberous sclerosis, scleroderma, obesity, 
transplantation, ulcers, fertility, cystitis, incontinence, and endometriosis as well as other 
diseases, disorders and conditions. 

These antibodies may be generated according to methods known in the art, using 
prediction from hydrophobicity charts, as described in the "Anti-NOVX Antibodies" section 
below. The disclosed NOV37 protein has multiple hydrophilic regions, each of which can be 
used as an immunogen. In one embodiment, a contemplated NOV37 epitope is from about 
amino acids 20 to 30. In another embodiment, a contemplated NOV37 epitope is from about 
amino acids 80 to 105. In other specific embodiments, contemplated NOV37 epitopes are 
from about amino acids 110 to 120, 175 to 210, 240 to 245, 280 to 320, 330 to 335, 390 to
395, 400 to 435, 470 to 490, 510 to 530, 575 to 635 and 720 to 790.
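
By way of non-limiting illustration only, a hydrophobicity chart of the kind referred to above can be approximated computationally. The following Python sketch assumes the Kyte-Doolittle hydropathy scale and a sliding window; the choice of scale, the window size, the cutoff, and the function names are assumptions made for illustration and are not specified in this disclosure.

```python
# Kyte-Doolittle hydropathy values (assumed scale; higher = more hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def window_hydropathy(protein, window=7):
    """Return the mean hydropathy of every window of the given width."""
    scores = [KYTE_DOOLITTLE[aa] for aa in protein.upper()]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def hydrophilic_windows(protein, window=7, cutoff=0.0):
    """Return 1-based start positions of windows whose mean is below cutoff."""
    means = window_hydropathy(protein, window)
    return [i + 1 for i, mean in enumerate(means) if mean < cutoff]


# Hypothetical toy peptide: four isoleucines followed by four lysines.
toy = "IIIIKKKK"
hydrophilic_starts = hydrophilic_windows(toy, window=4)
```

Windows reported by such a scan are only candidate regions; whether a given region serves as a useful immunogen would still need to be established experimentally.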

NOVX Nucleic Acids and Polypeptides

One aspect of the invention pertains to isolated nucleic acid molecules that encode 
NOVX polypeptides or biologically active portions thereof. Also included in the invention 
are nucleic acid fragments sufficient for use as hybridization probes to identify NOVX-
encoding nucleic acids (e.g., NOVX mRNAs) and fragments for use as PCR primers for the
amplification and/or mutation of NOVX nucleic acid molecules. As used herein, the term 
"nucleic acid molecule" is intended to include DNA molecules (e.g., cDNA or genomic 
DNA), RNA molecules (e.g., mRNA), analogs of the DNA or RNA generated using 
nucleotide analogs, and derivatives, fragments and homologs thereof. The nucleic acid 
molecule may be single-stranded or double-stranded, but preferably is comprised of double-
stranded DNA.

An NOVX nucleic acid can encode a mature NOVX polypeptide. As used herein, a
"mature" form of a polypeptide or protein disclosed in the present invention is the product of 
a naturally occurring polypeptide or precursor form or proprotein. The naturally occurring 
polypeptide, precursor or proprotein includes, by way of nonlimiting example, the full-length 
gene product, encoded by the corresponding gene. Alternatively, it may be defined as the 
polypeptide, precursor or proprotein encoded by an ORF described herein. The product 
"mature" form arises, again by way of nonlimiting example, as a result of one or more 
naturally occurring processing steps as they may take place within the cell, or host cell, in 
which the gene product arises. Examples of such processing steps leading to a "mature" form 
of a polypeptide or protein include the cleavage of the N-terminal methionine residue 
encoded by the initiation codon of an ORF, or the proteolytic cleavage of a signal peptide or 
leader sequence. Thus a mature form arising from a precursor polypeptide or protein that has 
residues 1 to N, where residue 1 is the N-terminal methionine, would have residues 2 through 
N remaining after removal of the N-terminal methionine. Alternatively, a mature form 
arising from a precursor polypeptide or protein having residues 1 to N, in which an N- 
terminal signal sequence from residue 1 to residue M is cleaved, would have the residues 
from residue M+1 to residue N remaining. Further as used herein, a "mature" form of a 
polypeptide or protein may arise from a step of post-translational modification other than a 
proteolytic cleavage event. Such additional processes include, by way of non-limiting 
example, glycosylation, myristoylation or phosphorylation. In general, a mature polypeptide 
or protein may result from the operation of only one of these processes, or a combination of 
any of them. 
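
By way of non-limiting illustration only, the residue numbering described above for the two proteolytic cases can be sketched in Python. The function names and the toy precursor sequence are hypothetical; residue numbers are 1-based as in the text, whereas Python string indices are 0-based.

```python
def mature_after_met_removal(precursor):
    """Return residues 2..N after removal of the N-terminal methionine."""
    if not precursor.startswith("M"):
        raise ValueError("residue 1 is expected to be methionine")
    return precursor[1:]


def mature_after_signal_cleavage(precursor, m):
    """Return residues M+1..N after cleavage of a signal sequence 1..M."""
    if not 1 <= m < len(precursor):
        raise ValueError("M must satisfy 1 <= M < N")
    return precursor[m:]


# Hypothetical 11-residue precursor (N = 11), used only for illustration.
toy_precursor = "MKTLLVAGSAQ"
without_met = mature_after_met_removal(toy_precursor)            # residues 2..11
without_signal = mature_after_signal_cleavage(toy_precursor, 4)  # residues 5..11
```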

The term "probes", as utilized herein, refers to nucleic acid sequences of variable 
length, preferably between at least about 10 nucleotides (nt), 100 nt, or as many as 
approximately, e.g., 6,000 nt, depending upon the specific use. Probes are used in the 
detection of identical, similar, or complementary nucleic acid sequences. Longer length 
probes are generally obtained from a natural or recombinant source, are highly specific, and
are much slower to hybridize than shorter-length oligomer probes. Probes may be single- or
double-stranded and designed to have specificity in PCR, membrane-based hybridization
technologies, or ELISA-like technologies.

The term "isolated" nucleic acid molecule, as utilized herein, is one which is
separated from other nucleic acid molecules which are present in the natural source of the
nucleic acid. Preferably, an "isolated" nucleic acid is free of sequences which naturally flank
the nucleic acid (i.e., sequences located at the 5'- and 3'-termini of the nucleic acid) in the
genomic DNA of the organism from which the nucleic acid is derived. For example, in
various embodiments, the isolated NOVX nucleic acid molecules can contain less than about 
5 kb, 4 kb, 3 kb, 2 kb, 1 kb, 0.5 kb or 0.1 kb of nucleotide sequences which naturally flank 
the nucleic acid molecule in genomic DNA of the cell/tissue from which the nucleic acid is 
derived (e.g., brain, heart, liver, spleen, etc.). Moreover, an "isolated" nucleic acid molecule, 
such as a cDNA molecule, can be substantially free of other cellular material or culture 
medium when produced by recombinant techniques, or of chemical precursors or other 
chemicals when chemically synthesized. 

A nucleic acid molecule of the invention, e.g., a nucleic acid molecule having the
nucleotide sequence of SEQ ID NOS:1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33,
35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77, 79, 81, 83, 
85, 87, 89, 91, 93, 95, 97, 99, 101, 103, 105, 107, 109 and 111, or a complement of this 
aforementioned nucleotide sequence, can be isolated using standard molecular biology 
techniques and the sequence information provided herein. Using all or a portion of the 
nucleic acid sequence of SEQ ID NOS:1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31,
33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77, 79, 81, 
83, 85, 87, 89, 91, 93, 95, 97, 99, 101, 103, 105, 107, 109 and 111 as a hybridization probe, 
NOVX molecules can be isolated using standard hybridization and cloning techniques (e.g., 
as described in Sambrook, et al., (eds.), MOLECULAR CLONING: A LABORATORY MANUAL 2nd
Ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY, 1989; and Ausubel, et
al., (eds.), CURRENT PROTOCOLS IN MOLECULAR BIOLOGY, John Wiley & Sons, New York,
NY, 1993).

A nucleic acid of the invention can be amplified using cDNA, mRNA or alternatively, 
genomic DNA, as a template and appropriate oligonucleotide primers according to standard 
PCR amplification techniques. The nucleic acid so amplified can be cloned into an 
appropriate vector and characterized by DNA sequence analysis. Furthermore, 
oligonucleotides corresponding to NOVX nucleotide sequences can be prepared by standard 
synthetic techniques, e.g., using an automated DNA synthesizer. 

As used herein, the term "oligonucleotide" refers to a series of linked nucleotide 
residues, which oligonucleotide has a sufficient number of nucleotide bases to be used in a 
PCR reaction. A short oligonucleotide sequence may be based on, or designed from, a 
genomic or cDNA sequence and is used to amplify, confirm, or reveal the presence of an 
identical, similar or complementary DNA or RNA in a particular cell or tissue. 
Oligonucleotides comprise portions of a nucleic acid sequence having about 10 nt, 50 nt, or
100 nt in length, preferably about 15 nt to 30 nt in length. In one embodiment of the
invention, an oligonucleotide comprising a nucleic acid molecule less than 100 nt in length
would further comprise at least 6 contiguous nucleotides of SEQ ID NOS:1, 3, 5, 7, 9, 11, 13,
15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 
65, 67, 69, 71, 73, 75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 99, 101, 103, 105, 107, 109 
and 111, or a complement thereof. Oligonucleotides may be chemically synthesized and may
also be used as probes. 
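
By way of non-limiting illustration only, the requirement that an oligonucleotide comprise at least 6 contiguous nucleotides of a reference sequence can be checked with a simple substring scan. In the Python sketch below, the function name and both toy sequences are hypothetical and do not correspond to any SEQ ID NO of this disclosure.

```python
def shares_contiguous_run(oligo, reference, k=6):
    """Return True if any k contiguous nucleotides of oligo occur in reference."""
    oligo = oligo.upper()
    reference = reference.upper()
    # Slide a window of width k along the oligonucleotide.
    return any(
        oligo[i:i + k] in reference
        for i in range(len(oligo) - k + 1)
    )


# Hypothetical reference and oligonucleotides, for illustration only.
toy_reference = "GGATCCATGGCTAGCTTAAGC"
matching_oligo = "TTTCATGGCAAA"   # contains CATGGC, which is in the reference
unrelated_oligo = "TTTCATGAAAA"   # no 6-nt run in common with the reference

has_match = shares_contiguous_run(matching_oligo, toy_reference)
has_no_match = shares_contiguous_run(unrelated_oligo, toy_reference)
```

Checking the complement as well would require applying the same scan to the reverse complement of the reference.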

In another embodiment, an isolated nucleic acid molecule of the invention comprises 
a nucleic acid molecule that is a complement of the nucleotide sequence shown in SEQ ID 
NOS:1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49,
51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 99,
101, 103, 105, 107, 109 and 111, or a portion of this nucleotide sequence (e.g., a fragment
that can be used as a probe or primer or a fragment encoding a biologically-active portion of
an NOVX polypeptide). A nucleic acid molecule that is complementary to the nucleotide
sequence shown in SEQ ID NOS:1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35,
37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77, 79, 81, 83, 85,
87, 89, 91, 93, 95, 97, 99, 101, 103, 105, 107, 109 or 111 is one that is sufficiently
complementary to the nucleotide sequence shown in SEQ ID NOS:1, 3, 5, 7, 9, 11, 13, 15, 17,
19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67,
69, 71, 73, 75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 99, 101, 103, 105, 107, 109 or 111
that it can hydrogen bond with few or no mismatches to the nucleotide sequence shown in SEQ
ID NOS:1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47,
49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 97,
99, 101, 103, 105, 107, 109 and 111, thereby forming a stable duplex.
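
By way of non-limiting illustration only, the notion of a complement that pairs with few or no mismatches can be sketched in Python for Watson-Crick pairing of DNA. The function names and toy sequences are hypothetical; Hoogsteen pairing and RNA bases are not modeled in this sketch.

```python
# Watson-Crick partners for DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the antiparallel Watson-Crick complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def count_mismatches(strand, candidate):
    """Count positions at which candidate fails to pair with strand.

    Both strands are written 5' to 3' and must be the same length.
    """
    if len(strand) != len(candidate):
        raise ValueError("strands must be the same length")
    perfect = reverse_complement(strand)
    return sum(a != b for a, b in zip(perfect, candidate.upper()))


# Hypothetical 6-nt strand, for illustration only.
toy_strand = "ATGCCG"
perfect_partner = reverse_complement(toy_strand)
one_off = count_mismatches(toy_strand, "CGGCAA")
```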

As used herein, the term "complementary" refers to Watson-Crick or Hoogsteen base 
pairing between nucleotide units of a nucleic acid molecule, and the term "binding" means
the physical or chemical interaction between two polypeptides or compounds or associated 
polypeptides or compounds or combinations thereof. Binding includes ionic, non-ionic, van 
der Waals, hydrophobic interactions, and the like. A physical interaction can be either direct 
or indirect. Indirect interactions may be through or due to the effects of another polypeptide 
or compound. Direct binding refers to interactions that do not take place through, or due to, 
the effect of another polypeptide or compound, but instead are without other substantial 
chemical intermediates.

Fragments provided herein are defined as sequences of at least 6 (contiguous) nucleic
acids or at least 4 (contiguous) amino acids, a length sufficient to allow for specific 
hybridization in the case of nucleic acids or for specific recognition of an epitope in the case 
of amino acids, respectively, and are at most some portion less than a full length sequence. 
Fragments may be derived from any contiguous portion of a nucleic acid or amino acid 
sequence of choice. Derivatives are nucleic acid sequences or amino acid sequences formed 
from the native compounds either directly or by modification or partial substitution. Analogs 
are nucleic acid sequences or amino acid sequences that have a structure similar to, but not 
identical to, the native compound but differ from it with respect to certain components or side
chains. Analogs may be synthetic or from a different evolutionary origin and may have a 
similar or opposite metabolic activity compared to wild type. Homologs are nucleic acid 
sequences or amino acid sequences of a particular gene that are derived from different 
species. 

Derivatives and analogs may be full length or other than full length, if the derivative 
or analog contains a modified nucleic acid or amino acid, as described below. Derivatives or 
analogs of the nucleic acids or proteins of the invention include, but are not limited to, 
molecules comprising regions that are substantially homologous to the nucleic acids or 
proteins of the invention, in various embodiments, by at least about 70%, 80%, or 95% 
identity (with a preferred identity of 80-95%) over a nucleic acid or amino acid sequence of 
identical size or when compared to an aligned sequence in which the alignment is done by a 
computer homology program known in the art, or whose encoding nucleic acid is capable of 
hybridizing to the complement of a sequence encoding the aforementioned proteins under 
stringent, moderately stringent, or low stringent conditions. See e.g. Ausubel, et al.,
Current Protocols in Molecular Biology, John Wiley & Sons, New York, NY, 1993, 
and below. 
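
By way of non-limiting illustration only, percent identity over two sequences of identical size can be computed as sketched below in Python. The function name is hypothetical, gaps are not handled, and the two 16-residue fragments are the NOV37 residues 649 to 664 and the CA domain residues 61 to 76 from the alignment reproduced earlier in this section. Alignment programs known in the art may define the denominator differently.

```python
def percent_identity(seq_a, seq_b):
    """Percent of positions that are identical in two equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be of identical size")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    identical = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * identical / len(seq_a)


nov37_fragment = "EPPRSATATLHVLLVD"  # NOV37 residues 649-664
domain_fragment = "GPPLSSTATVTVTVLD"  # CA domain residues 61-76

identity = percent_identity(nov37_fragment, domain_fragment)
meets_70_percent = identity >= 70.0
```

For these two short fragments the identity falls below the 70% level recited above, which illustrates that a conserved-domain hit and the derivative or analog thresholds are separate criteria.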

A "homologous nucleic acid sequence" or "homologous amino acid sequence," or 
variations thereof, refer to sequences characterized by a homology at the nucleotide level or 
amino acid level as discussed above. Homologous nucleotide sequences encode those 
sequences coding for isoforms of NOVX polypeptides. Isoforms can be expressed in 
different tissues of the same organism as a result of, for example, alternative splicing of 
RNA. Alternatively, isoforms can be encoded by different genes. In the invention, 
homologous nucleotide sequences include nucleotide sequences encoding for an NOVX 
polypeptide of species other than humans, including, but not limited to: vertebrates, and thus
can include, e.g., frog, mouse, rat, rabbit, dog, cat, cow, horse, and other organisms.
Homologous nucleotide sequences also include, but are not limited to, naturally occurring
allelic variations and mutations of the nucleotide sequences set forth herein. A homologous 
nucleotide sequence does not, however, include the exact nucleotide sequence encoding 
human NOVX protein. Homologous nucleic acid sequences include those nucleic acid 
sequences that encode conservative amino acid substitutions (see below) in SEQ ID NOS:1,
3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53,
55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 99, 101,
103, 105, 107, 109 and 111, as well as a polypeptide possessing NOVX biological activity.
Various biological activities of the NOVX proteins are described below. 

An NOVX polypeptide is encoded by the open reading frame ("ORF") of an NOVX 
nucleic acid. An ORF corresponds to a nucleotide sequence that could potentially be 
translated into a polypeptide. A stretch of nucleic acids comprising an ORF is uninterrupted 
by a stop codon. An ORF that represents the coding sequence for a full protein begins with 
an ATG "start" codon and terminates with one of the three "stop" codons, namely, TAA, 
TAG, or TGA. For the purposes of this invention, an ORF may be any part of a coding 
sequence, with or without a start codon, a stop codon, or both. For an ORF to be considered 
as a good candidate for coding for a bona fide cellular protein, a minimum size requirement is 
often set, e.g., a stretch of DNA that would encode a protein of 50 amino acids or more. 
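
By way of non-limiting illustration only, the full-protein case described above (an ATG start codon, an uninterrupted reading frame, one of the three stop codons, and a minimum size) can be sketched in Python. The function name, the return format, and the toy sequence are hypothetical; only the three forward frames of the given strand are scanned.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_codons=50):
    """Return (start, end, frame) for ATG-to-stop ORFs on the given strand.

    start is the 0-based index of the A of ATG; end is exclusive and
    includes the stop codon. min_codons counts coding codons, including
    the ATG but excluding the stop codon.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None
    return orfs


# Hypothetical toy sequence: 2 leading nt, ATG, three GCT codons, TAA, 1 nt.
toy_dna = "CC" + "ATG" + "GCT" * 3 + "TAA" + "G"
short_orfs = find_orfs(toy_dna, min_codons=4)
```

With the default threshold of 50 codons, the toy sequence yields no ORF, consistent with the minimum size requirement noted above.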

The nucleotide sequences determined from the cloning of the human NOVX genes 
allow for the generation of probes and primers designed for use in identifying and/or cloning
NOVX homologues in other cell types, e.g. from other tissues, as well as NOVX homologues 
from other vertebrates. The probe/primer typically comprises substantially purified 
oligonucleotide. The oligonucleotide typically comprises a region of nucleotide sequence 
that hybridizes under stringent conditions to at least about 12, 25, 50, 100, 150, 200, 250, 
300, 350 or 400 consecutive nucleotides of a sense strand sequence of SEQ ID NOS:1, 3, 5, 7, 9, 11,
13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61,
63, 65, 67, 69, 71, 73, 75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 99, 101, 103, 105, 107,
109 and 111; or an anti-sense strand nucleotide sequence of SEQ ID NOS:1, 3, 5, 7, 9, 11, 13,
15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63,
65, 67, 69, 71, 73, 75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 99, 101, 103, 105, 107, 109
and 111; or of a naturally occurring mutant of SEQ ID NOS:1, 3, 5, 7, 9, 11, 13, 15, 17, 19,
21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69,
71, 73, 75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 99, 101, 103, 105, 107, 109 and 111.

Probes based on the human NOVX nucleotide sequences can be used to detect
transcripts or genomic sequences encoding the same or homologous proteins. In various 
embodiments, the probe further comprises a label group attached thereto, e.g., the label group
can be a radioisotope, a fluorescent compound, an enzyme, or an enzyme co-factor. Such 
probes can be used as a part of a diagnostic test kit for identifying cells or tissues which mis- 
express an NOVX protein, such as by measuring a level of an NOVX-encoding nucleic acid 
in a sample of cells from a subject, e.g., detecting NOVX mRNA levels or determining
whether a genomic NOVX gene has been mutated or deleted. 

"A polypeptide having a biologically-active portion of an NOVX polypeptide" refers 
to polypeptides exhibiting activity similar, but not necessarily identical to, an activity of a 
polypeptide of the invention, including mature forms, as measured in a particular biological 
assay, with or without dose dependency. A nucleic acid fragment encoding a "biologically- 
active portion of NOVX" can be prepared by isolating a portion of SEQ ID NOS:1, 3, 5, 7, 9,
11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59,
61, 63, 65, 67, 69, 71, 73, 75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 99, 101, 103, 105, 
107, 109 and 111, that encodes a polypeptide having an NOVX biological activity (the 
biological activities of the NOVX proteins are described below), expressing the encoded 
portion of NOVX protein (e.g., by recombinant expression in vitro) and assessing the activity 
of the encoded portion of NOVX. 

NOVX Nucleic Acid and Polypeptide Variants 

The invention further encompasses nucleic acid molecules that differ from the
nucleotide sequences shown in SEQ ID NOS:1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27,
29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77,
79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 99, 101, 103, 105, 107, 109 and 111 due to degeneracy
of the genetic code and thus encode the same NOVX proteins as those encoded by the
nucleotide sequences shown in SEQ ID NOS:1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27,
29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77,
79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 99, 101, 103, 105, 107, 109 and 111. In another
embodiment, an isolated nucleic acid molecule of the invention has a nucleotide sequence
encoding a protein having an amino acid sequence shown in SEQ ID NOS:2, 4, 6, 8, 10, 12,
14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62,
64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100, 102, 104, 106, 108, 
110, and 112.

In addition to the human NOVX nucleotide sequences shown in SEQ ID NOS:1, 3, 5,
7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55,
57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 99, 101, 103,
105, 107, 109 and 111, it will be appreciated by those skilled in the art that DNA sequence
polymorphisms that lead to changes in the amino acid sequences of the NOVX polypeptides
may exist within a population (e.g., the human population). Such genetic polymorphism in
the NOVX genes may exist among individuals within a population due to natural allelic
variation. As used herein, the terms "gene" and "recombinant gene" refer to nucleic acid
molecules comprising an open reading frame (ORF) encoding an NOVX protein, preferably a
vertebrate NOVX protein. Such natural allelic variations can typically result in 1-5%
variance in the nucleotide sequence of the NOVX genes. Any and all such nucleotide
variations and resulting amino acid polymorphisms in the NOVX polypeptides, which are the
result of natural allelic variation and that do not alter the functional activity of the NOVX
polypeptides, are intended to be within the scope of the invention.
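
By way of non-limiting illustration only, the relationship between nucleotide variance and the encoded amino acid sequence, including variation that is silent because of the degeneracy of the genetic code, can be sketched in Python. The sketch assumes the standard genetic code; the function names and the two toy coding sequences are hypothetical and are not NOVX sequences.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(_BASES, repeat=3), _AMINO)
}


def translate(dna):
    """Translate a DNA coding sequence; '*' marks a stop codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def percent_variance(seq_a, seq_b):
    """Percent of nucleotide positions that differ in equal-length sequences."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    differing = sum(a != b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * differing / len(seq_a)


# Two hypothetical coding sequences that differ at half their positions
# yet encode the same peptide.
toy_reference = "ATGCTGTCTAAA"
toy_variant = "ATGTTAAGCAAG"

same_protein = translate(toy_reference) == translate(toy_variant)
variance = percent_variance(toy_reference, toy_variant)
```

The toy pair is deliberately extreme; natural allelic variants of the kind described above would differ at far fewer positions.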

Moreover, nucleic acid molecules encoding NOVX proteins from other species, and
thus that have a nucleotide sequence that differs from the human SEQ ID NOS:1, 3, 5, 7, 9,
11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59,
61, 63, 65, 67, 69, 71, 73, 75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 99, 101, 103, 105,
107, 109 and 111 are intended to be within the scope of the invention. Nucleic acid
molecules corresponding to natural allelic variants and homologues of the NOVX cDNAs of
the invention can be isolated based on their homology to the human NOVX nucleic acids
disclosed herein using the human cDNAs, or a portion thereof, as a hybridization probe
according to standard hybridization techniques under stringent hybridization conditions.

Accordingly, in another embodiment, an isolated nucleic acid molecule of the
invention is at least 6 nucleotides in length and hybridizes under stringent conditions to the
nucleic acid molecule comprising the nucleotide sequence of SEQ ID NOS:1, 3, 5, 7, 9, 11,
13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61,
63, 65, 67, 69, 71, 73, 75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 99, 101, 103, 105, 107,
109 and 111. In another embodiment, the nucleic acid is at least 10, 25, 50, 100, 250, 500,
750, 1000, 1500, or 2000 or more nucleotides in length. In yet another embodiment, an
isolated nucleic acid molecule of the invention hybridizes to the coding region. As used
herein, the term "hybridizes under stringent conditions" is intended to describe conditions for
hybridization and washing under which nucleotide sequences at least 60% homologous to
each other typically remain hybridized to each other.

Homologs (i.e., nucleic acids encoding NOVX proteins derived from species other
than human) or other related sequences (e.g., paralogs) can be obtained by low, moderate or 
high stringency hybridization with all or a portion of the particular human sequence as a 
probe using methods well known in the art for nucleic acid hybridization and cloning. 

As used herein, the phrase "stringent hybridization conditions" refers to conditions 
under which a probe, primer or oligonucleotide will hybridize to its target sequence, but to no 
other sequences. Stringent conditions are sequence-dependent and will be different in 
different circumstances. Longer sequences hybridize specifically at higher temperatures than 
shorter sequences. Generally, stringent conditions are selected to be about 5 °C lower than the 
thermal melting point (Tm) for the specific sequence at a defined ionic strength and pH. The 
Tm is the temperature (under defined ionic strength, pH and nucleic acid concentration) at 
which 50% of the probes complementary to the target sequence hybridize to the target 
sequence at equilibrium. Since the target sequences are generally present at excess, at Tm, 
50% of the probes are occupied at equilibrium. Typically, stringent conditions will be those 
in which the salt concentration is less than about 1.0 M sodium ion, typically about 0.01 to 
1.0 M sodium ion (or other salts) at pH 7.0 to 8.3 and the temperature is at least about 30 °C
for short probes, primers or oligonucleotides (e.g., 10 nt to 50 nt) and at least about 60 °C for 
longer probes, primers and oligonucleotides. Stringent conditions may also be achieved with 
the addition of destabilizing agents, such as formamide. 
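
By way of non-limiting illustration only, the selection of a temperature about 5 °C below the Tm can be sketched in Python. The disclosure does not specify how the Tm is estimated; the sketch assumes the Wallace rule of thumb (2 °C per A or T, 4 °C per G or C), which is a rough approximation intended for short oligonucleotides and does not account for ionic strength, pH, or formamide. The function names and the toy oligonucleotide are hypothetical.

```python
def wallace_tm(oligo):
    """Estimate Tm in degrees C using the Wallace rule (assumed method)."""
    seq = oligo.upper()
    at_count = seq.count("A") + seq.count("T")
    gc_count = seq.count("G") + seq.count("C")
    return 2.0 * at_count + 4.0 * gc_count


def stringent_temperature(oligo, offset=5.0):
    """Return a temperature about `offset` degrees C below the estimated Tm."""
    return wallace_tm(oligo) - offset


# Hypothetical 16-nt oligonucleotide with 8 A/T and 8 G/C residues.
toy_oligo = "ATGCATGCATGCATGC"
estimated_tm = wallace_tm(toy_oligo)
hybridization_temp = stringent_temperature(toy_oligo)
```

For longer probes or for defined salt and formamide concentrations, an empirically calibrated formula or direct measurement would be more appropriate than this rule of thumb.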

Stringent conditions are known to those skilled in the art and can be found in Ausubel, 
et al., (eds.), CURRENT PROTOCOLS IN MOLECULAR BIOLOGY, John Wiley & Sons, N.Y.
(1989), 6.3.1-6.3.6. Preferably, the conditions are such that sequences at least about 65%, 
70%, 75%, 85%, 90%, 95%, 98%, or 99% homologous to each other typically remain 
hybridized to each other. A non-limiting example of stringent hybridization conditions is
hybridization in a high salt buffer comprising 6X SSC, 50 mM Tris-HCl (pH 7.5), 1 mM
EDTA, 0.02% PVP, 0.02% Ficoll, 0.02% BSA, and 500 mg/ml denatured salmon sperm
DNA at 65 °C, followed by one or more washes in 0.2X SSC, 0.01% BSA at 50 °C. An
isolated nucleic acid molecule of the invention that hybridizes under stringent conditions to
the sequences of SEQ ID NOS:1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37,
39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77, 79, 81, 83, 85, 87,
89, 91, 93, 95, 97, 99, 101, 103, 105, 107, 109 and 111, corresponds to a naturally-occurring
nucleic acid molecule. As used herein, a "naturally-occurring" nucleic acid molecule refers
to an RNA or DNA molecule having a nucleotide sequence that occurs in nature (e.g.,
encodes a natural protein).

In a second embodiment, a nucleic acid sequence that is hybridizable to the nucleic
acid molecule comprising the nucleotide sequence of SEQ ID NOS:1, 3, 5, 7, 9, 11, 13, 15,
17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65,
67, 69, 71, 73, 75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 99, 101, 103, 105, 107, 109 and
111, or fragments, analogs or derivatives thereof, under conditions of moderate stringency is
provided. A non-limiting example of moderate stringency hybridization conditions is
hybridization in 6X SSC, 5X Denhardt's solution, 0.5% SDS and 100 mg/ml denatured
salmon sperm DNA at 55 °C, followed by one or more washes in 1X SSC, 0.1% SDS at 37 °C.
Other conditions of moderate stringency that may be used are well-known within the art. See,
e.g., Ausubel, et al. (eds.), 1993, CURRENT PROTOCOLS IN MOLECULAR BIOLOGY, John Wiley
& Sons, NY, and Kriegler, 1990, GENE TRANSFER AND EXPRESSION, A LABORATORY
MANUAL, Stockton Press, NY.

In a third embodiment, a nucleic acid that is hybridizable to the nucleic acid molecule 
comprising the nucleotide sequences SEQ ID NOS: 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 
27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 
77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 99, 101, 103, 105, 107, 109 and 111, or fragments, 
analogs or derivatives thereof, under conditions of low stringency, is provided. A 
non-limiting example of low stringency hybridization conditions are hybridization in 35% 
formamide, 5X SSC, 50 mM Tris-HCl (pH 7.5), 5 mM EDTA, 0.02% PVP, 0.02% Ficoll, 
0.2% BSA, 100 mg/ml denatured salmon sperm DNA, 10% (wt/vol) dextran sulfate at 40°C, 
followed by one or more washes in 2X SSC, 25 mM Tris-HCl (pH 7.4), 5 mM EDTA, and 
0.1% SDS at 50°C. Other conditions of low stringency that may be used are well known in 
the art (e.g., as employed for cross-species hybridizations). See, e.g., Ausubel, et al. (eds.), 
1993, CURRENT PROTOCOLS IN MOLECULAR BIOLOGY, John Wiley & Sons, NY, and Kriegler, 
1990, GENE TRANSFER AND EXPRESSION, A LABORATORY MANUAL, Stockton Press, NY; 
Shilo and Weinberg, 1981. Proc Natl Acad Sci USA 78: 6789-6792. 

Conservative Mutations 

In addition to naturally-occurring allelic variants of NOVX sequences that may exist 
in the population, the skilled artisan will further appreciate that changes can be introduced by 
mutation into the nucleotide sequences SEQ ID NOS:1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 
25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 
75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 99, 101, 103, 105, 107, 109 and 111, thereby 
leading to changes in the amino acid sequences of the encoded NOVX proteins, without 
altering the functional ability of said NOVX proteins. For example, nucleotide substitutions 
leading to amino acid substitutions at "non-essential" amino acid residues can be made in the 
sequence SEQ ID NOS:2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 
42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 
92, 94, 96, 98, 100, 102, 104, 106, 108, 110, and 112. A "non-essential" amino acid residue 
is a residue that can be altered from the wild-type sequences of the NOVX proteins without 
altering their biological activity, whereas an "essential" amino acid residue is required for 
such biological activity. For example, amino acid residues that are conserved among the 
NOVX proteins of the invention are predicted to be particularly non-amenable to alteration. 
Amino acids for which conservative substitutions can be made are well-known within the art. 

Another aspect of the invention pertains to nucleic acid molecules encoding NOVX 
proteins that contain changes in amino acid residues that are not essential for activity. Such 
NOVX proteins differ in amino acid sequence from SEQ ID NOS:1, 3, 5, 7, 9, 11, 13, 15, 17, 
19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 
69, 71, 73, 75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 99, 101, 103, 105, 107, 109 and 111 
yet retain biological activity. In one embodiment, the isolated nucleic acid molecule 
comprises a nucleotide sequence encoding a protein, wherein the protein comprises an amino 
acid sequence at least about 45% homologous to the amino acid sequences SEQ ID NOS:2, 4, 
6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 
56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100, 102, 
104, 106, 108, 110, and 112. Preferably, the protein encoded by the nucleic acid molecule is 
at least about 60% homologous to SEQ ID NOS:2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 
28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 
78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100, 102, 104, 106, 108, 110, and 112; more 
preferably at least about 70% homologous to SEQ ID NOS:2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 
24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 
74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100, 102, 104, 106, 108, 110, and 112; still 
more preferably at least about 80% homologous to SEQ ID NOS:2, 4, 6, 8, 10, 12, 14, 16, 18, 
20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 
70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100, 102, 104, 106, 108, 110, and 
112; even more preferably at least about 90% homologous to SEQ ID NOS:2, 4, 6, 8, 10, 12, 
14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 
64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100, 102, 104, 106, 108, 
110, and 112; and most preferably at least about 95% homologous to SEQ ID NOS:2, 4, 6, 8, 
10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 
60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100, 102, 104, 
106, 108, 110, and 112. 

An isolated nucleic acid molecule encoding an NOVX protein homologous to the 
protein of SEQ ID NOS:2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 
40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 
90, 92, 94, 96, 98, 100, 102, 104, 106, 108, 110, and 112 can be created by introducing one or 
more nucleotide substitutions, additions or deletions into the nucleotide sequence of SEQ ID 
NOS:1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 
51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 99, 
101, 103, 105, 107, 109 and 111, such that one or more amino acid substitutions, additions or 
deletions are introduced into the encoded protein. 

Mutations can be introduced into SEQ ID NOS:1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 
25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 
75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 99, 101, 103, 105, 107, 109 and 111 by standard 
techniques, such as site-directed mutagenesis and PCR-mediated mutagenesis. Preferably, 
conservative amino acid substitutions are made at one or more predicted, non-essential amino 
acid residues. A "conservative amino acid substitution" is one in which the amino acid 
residue is replaced with an amino acid residue having a similar side chain. Families of amino 
acid residues having similar side chains have been defined within the art. These families 
include amino acids with basic side chains (e.g., lysine, arginine, histidine), acidic side chains 
(e.g., aspartic acid, glutamic acid), uncharged polar side chains (e.g., glycine, asparagine, 
glutamine, serine, threonine, tyrosine, cysteine), nonpolar side chains (e.g., alanine, valine, 
leucine, isoleucine, proline, phenylalanine, methionine, tryptophan), beta-branched side 
chains (e.g., threonine, valine, isoleucine) and aromatic side chains (e.g., tyrosine, 
phenylalanine, tryptophan, histidine). Thus, a predicted non-essential amino acid residue in 
the NOVX protein is replaced with another amino acid residue from the same side chain 
family. Alternatively, in another embodiment, mutations can be introduced randomly along 
all or part of an NOVX coding sequence, such as by saturation mutagenesis, and the resultant 
mutants can be screened for NOVX biological activity to identify mutants that retain activity. 
Following mutagenesis of SEQ ID NOS:1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 
33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77, 79, 81, 
83, 85, 87, 89, 91, 93, 95, 97, 99, 101, 103, 105, 107, 109 and 111, the encoded protein can 
be expressed by any recombinant technology known in the art and the activity of the protein 
can be determined. 
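
The side-chain families listed above can be expressed as a simple lookup. The following is a minimal sketch, not part of the disclosed methods: the names `SIDE_CHAIN_FAMILIES`, `shared_families` and `is_conservative` are hypothetical, and the family memberships are taken directly from the list in the preceding paragraph using one-letter amino acid codes.

```python
# Illustrative sketch only: side-chain families as listed in the text above,
# written with one-letter amino acid codes. All names here are hypothetical.
SIDE_CHAIN_FAMILIES = {
    "basic": set("KRH"),                # lysine, arginine, histidine
    "acidic": set("DE"),                # aspartic acid, glutamic acid
    "uncharged_polar": set("GNQSTYC"),  # glycine ... cysteine
    "nonpolar": set("AVLIPFMW"),        # alanine ... tryptophan
    "beta_branched": set("TVI"),        # threonine, valine, isoleucine
    "aromatic": set("YFWH"),            # tyrosine, phenylalanine, tryptophan, histidine
}


def shared_families(original, replacement):
    """Return the sorted names of all families that contain both residues."""
    return sorted(
        name
        for name, members in SIDE_CHAIN_FAMILIES.items()
        if original in members and replacement in members
    )


def is_conservative(original, replacement):
    """True if the two residues differ and share at least one side-chain family."""
    return original != replacement and bool(shared_families(original, replacement))


if __name__ == "__main__":
    for pair in [("K", "R"), ("D", "E"), ("K", "D"), ("V", "I")]:
        print(pair, is_conservative(*pair), shared_families(*pair))
```

Because a residue can belong to more than one family (for example, valine is both nonpolar and beta-branched), the sketch reports every shared family rather than assigning each residue to a single class.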

The relatedness of amino acid families may also be determined based on side chain 
interactions. Substituted amino acids may be fully conserved "strong" residues or fully 
conserved "weak" residues. The "strong" group of conserved amino acid residues may be any 
one of the following groups: STA, NEQK, NHQK, NDEQ, QHRK, MILV, MILF, HY, 
FYW, wherein the single letter amino acid codes are grouped by those amino acids that may 
be substituted for each other. Likewise, the "weak" group of conserved residues may be any 
one of the following: CSA, ATV, SAG, STNK, STPA, SGND, SNDEQK, NDEQHK, 
NEQHRK, HFY, wherein the letters within each group represent the single letter amino acid 
code. 
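
As an illustration of how the "strong" and "weak" groups above might be applied, the sketch below classifies a single substitution. It is a hedged example under stated assumptions: the function name `classify_substitution` and the returned labels are hypothetical, and the group strings are copied from the preceding paragraph.

```python
# Illustrative sketch only: the group strings are copied from the text above;
# the function name and the returned labels are hypothetical.
STRONG_GROUPS = ["STA", "NEQK", "NHQK", "NDEQ", "QHRK", "MILV", "MILF", "HY", "FYW"]
WEAK_GROUPS = ["CSA", "ATV", "SAG", "STNK", "STPA", "SGND", "SNDEQK", "NDEQHK",
               "NEQHRK", "HFY"]


def classify_substitution(original, replacement):
    """Classify a substitution as identical, strong, weak or non-conserved."""
    if original == replacement:
        return "identical"
    # A substitution is "strong" if both residues occur in one strong group.
    if any(original in g and replacement in g for g in STRONG_GROUPS):
        return "strong"
    # Otherwise it is "weak" if both residues occur in one weak group.
    if any(original in g and replacement in g for g in WEAK_GROUPS):
        return "weak"
    return "non-conserved"


if __name__ == "__main__":
    for pair in [("S", "T"), ("C", "S"), ("W", "P"), ("L", "L")]:
        print(pair, classify_substitution(*pair))
```

The strong groups are checked first, so a pair that appears in both a strong and a weak group is reported as strong.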

In one embodiment, a mutant NOVX protein can be assayed for (i) the ability to form 
protein:protein interactions with other NOVX proteins, other cell-surface proteins, or 
biologically-active portions thereof; (ii) complex formation between a mutant NOVX protein 
and an NOVX ligand; or (iii) the ability of a mutant NOVX protein to bind to an intracellular 
target protein or biologically-active portion thereof (e.g., avidin proteins). 

In yet another embodiment, a mutant NOVX protein can be assayed for the ability to 
regulate a specific biological function (e.g., regulation of insulin release). 

Antisense Nucleic Acids 

Another aspect of the invention pertains to isolated antisense nucleic acid molecules 
that are hybridizable to or complementary to the nucleic acid molecule comprising the 
nucleotide sequence of SEQ ID NOS:1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 
33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77, 79, 81, 
83, 85, 87, 89, 91, 93, 95, 97, 99, 101, 103, 105, 107, 109 and 111, or fragments, analogs or 
derivatives thereof. An "antisense" nucleic acid comprises a nucleotide sequence that is 
complementary to a "sense" nucleic acid encoding a protein (e.g., complementary to the 
coding strand of a double-stranded cDNA molecule or complementary to an mRNA 
sequence). In specific aspects, antisense nucleic acid molecules are provided that comprise a 
sequence complementary to at least about 10, 25, 50, 100, 250 or 500 nucleotides or an entire 
NOVX coding strand, or to only a portion thereof. Nucleic acid molecules encoding 
fragments, homologs, derivatives and analogs of an NOVX protein of SEQ ID NOS:2, 4, 6, 
8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 
58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100, 102, 104, 
106, 108, 110, and 112, or antisense nucleic acids complementary to an NOVX nucleic acid 
sequence of SEQ ID NOS:1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 
39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77, 79, 81, 83, 85, 87, 
89, 91, 93, 95, 97, 99, 101, 103, 105, 107, 109 and 111, are additionally provided. 

In one embodiment, an antisense nucleic acid molecule is antisense to a "coding 
region" of the coding strand of a nucleotide sequence encoding an NOVX protein. The term 
"coding region" refers to the region of the nucleotide sequence comprising codons which are 
translated into amino acid residues. In another embodiment, the antisense nucleic acid 
molecule is antisense to a "noncoding region" of the coding strand of a nucleotide sequence 
encoding the NOVX protein. The term "noncoding region" refers to 5' and 3' sequences 
which flank the coding region that are not translated into amino acids (i.e., also referred to as 
5' and 3' untranslated regions). 

Given the coding strand sequences encoding the NOVX protein disclosed herein, 
antisense nucleic acids of the invention can be designed according to the rules of Watson and 
Crick or Hoogsteen base pairing. The antisense nucleic acid molecule can be complementary 
to the entire coding region of NOVX mRNA, but more preferably is an oligonucleotide that is 
antisense to only a portion of the coding or noncoding region of NOVX mRNA. For 
example, the antisense oligonucleotide can be complementary to the region surrounding the 
translation start site of NOVX mRNA. An antisense oligonucleotide can be, for example, 
about 5, 10, 15, 20, 25, 30, 35, 40, 45 or 50 nucleotides in length. An antisense nucleic acid 
of the invention can be constructed using chemical synthesis or enzymatic ligation reactions 
using procedures known in the art. For example, an antisense nucleic acid (e.g., an antisense 
oligonucleotide) can be chemically synthesized using naturally-occurring nucleotides or 
variously modified nucleotides designed to increase the biological stability of the molecules 
or to increase the physical stability of the duplex formed between the antisense and sense 
nucleic acids (e.g., phosphorothioate derivatives and acridine substituted nucleotides can be 
used). 
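
The design step described above (an oligonucleotide complementary to the region surrounding the translation start site) can be sketched with Watson-Crick pairing only. This is an illustrative simplification and not the disclosed method: the function name `antisense_around_start`, its `length` and `upstream` parameters, and the choice of the first AUG as the start codon are all assumptions made for the example.

```python
# Illustrative sketch only. Assumptions: Watson-Crick pairing, an unmodified
# DNA oligonucleotide, and the first AUG in the input treated as the start
# codon (a real transcript may initiate translation elsewhere).
DNA_COMPLEMENT_OF_RNA = {"A": "T", "U": "A", "G": "C", "C": "G"}


def antisense_around_start(mrna, length=20, upstream=5):
    """Return a DNA antisense oligo (5'->3') covering the first AUG.

    The target window begins `upstream` nucleotides 5' of the AUG (or at the
    5' end of the input if fewer are available) and spans `length` nucleotides.
    """
    rna = mrna.upper().replace("T", "U")
    start = rna.find("AUG")
    if start < 0:
        raise ValueError("no AUG found in the supplied sequence")
    begin = max(0, start - upstream)
    target = rna[begin:begin + length]
    # Antisense strand = complement of the target, read in reverse (5'->3').
    return "".join(DNA_COMPLEMENT_OF_RNA[base] for base in reversed(target))


if __name__ == "__main__":
    print(antisense_around_start("GGCACCAUGGCUAGCUUAGGCCAUAA"))
```

The default length of 20 nucleotides falls within the range of oligonucleotide lengths given above; modified nucleotides or Hoogsteen pairing are outside the scope of this sketch.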

Examples of modified nucleotides that can be used to generate the antisense nucleic 
acid include: 5-fluorouracil, 5-bromouracil, 5-chlorouracil, 5-iodouracil, hypoxanthine, 
xanthine, 4-acetylcytosine, 5-(carboxyhydroxylmethyl) uracil, 
5-carboxymethylaminomethyl-2-thiouridine, 5-carboxymethylaminomethyluracil, 
dihydrouracil, beta-D-galactosylqueosine, 
inosine, N6-isopentenyladenine, 1-methylguanine, 1-methylinosine, 2,2-dimethylguanine, 
2-methyladenine, 2-methylguanine, 3-methylcytosine, 5-methylcytosine, N6-adenine, 
7-methylguanine, 5-methylaminomethyluracil, 5-methoxyaminomethyl-2-thiouracil, 
beta-D-mannosylqueosine, 5'-methoxycarboxymethyluracil, 5-methoxyuracil, 
2-methylthio-N6-isopentenyladenine, uracil-5-oxyacetic acid (v), wybutoxosine, 
pseudouracil, queosine, 2-thiocytosine, 5-methyl-2-thiouracil, 2-thiouracil, 4-thiouracil, 
5-methyluracil, uracil-5-oxyacetic acid methylester, uracil-5-oxyacetic acid (v), 
5-methyl-2-thiouracil, 3-(3-amino-3-N-2-carboxypropyl) uracil, (acp3)w, and 
2,6-diaminopurine. Alternatively, the antisense nucleic acid can be produced biologically 
using an expression vector into which a nucleic acid has been subcloned in an antisense 
orientation (i.e., RNA transcribed from the inserted nucleic acid will be of an antisense 
orientation to a target nucleic acid of interest, described further in the following subsection). 

The antisense nucleic acid molecules of the invention are typically administered to a 
subject or generated in situ such that they hybridize with or bind to cellular mRNA and/or 
genomic DNA encoding an NOVX protein to thereby inhibit expression of the protein (e.g., 
by inhibiting transcription and/or translation). The hybridization can be by conventional 
nucleotide complementarity to form a stable duplex, or, for example, in the case of an 
antisense nucleic acid molecule that binds to DNA duplexes, through specific interactions in 
the major groove of the double helix. An example of a route of administration of antisense 
nucleic acid molecules of the invention includes direct injection at a tissue site. 
Alternatively, antisense nucleic acid molecules can be modified to target selected cells and 
then administered systemically. For example, for systemic administration, antisense 
molecules can be modified such that they specifically bind to receptors or antigens expressed 
on a selected cell surface (e.g., by linking the antisense nucleic acid molecules to peptides or 
antibodies that bind to cell surface receptors or antigens). The antisense nucleic acid 
molecules can also be delivered to cells using the vectors described herein. To achieve 
sufficient nucleic acid molecules, vector constructs in which the antisense nucleic acid 
molecule is placed under the control of a strong pol II or pol III promoter are preferred. 

In yet another embodiment, the antisense nucleic acid molecule of the invention is an 
α-anomeric nucleic acid molecule. An α-anomeric nucleic acid molecule forms specific 
double-stranded hybrids with complementary RNA in which, contrary to the usual β-units, 
the strands run parallel to each other. See, e.g., Gaultier, et al., 1987. Nucl. Acids Res. 15: 
6625-6641. The antisense nucleic acid molecule can also comprise a 
2'-o-methylribonucleotide (see, e.g., Inoue, et al., 1987. Nucl. Acids Res. 15: 6131-6148) or a 
chimeric RNA-DNA analogue (see, e.g., Inoue, et al., 1987. FEBS Lett. 215: 327-330). 

Ribozymes and PNA Moieties 

Nucleic acid modifications include, by way of non-limiting example, modified bases, 
and nucleic acids whose sugar phosphate backbones are modified or derivatized. These 
modifications are carried out at least in part to enhance the chemical stability of the modified 
nucleic acid, such that they may be used, for example, as antisense binding nucleic acids in 
therapeutic applications in a subject. 

In one embodiment, an antisense nucleic acid of the invention is a ribozyme. 
Ribozymes are catalytic RNA molecules with ribonuclease activity that are capable of 
cleaving a single-stranded nucleic acid, such as an mRNA, to which they have a 
complementary region. Thus, ribozymes (e.g., hammerhead ribozymes as described in 
Haseloff and Gerlach 1988. Nature 334: 585-591) can be used to catalytically cleave NOVX 
mRNA transcripts to thereby inhibit translation of NOVX mRNA. A ribozyme having 
specificity for an NOVX-encoding nucleic acid can be designed based upon the nucleotide 
sequence of an NOVX cDNA disclosed herein (i.e., SEQ ID NOS:1, 3, 5, 7, 9, 11, 13, 15, 17, 
19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 
69, 71, 73, 75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 99, 101, 103, 105, 107, 109 and 111). 
For example, a derivative of a Tetrahymena L-19 IVS RNA can be constructed in which the 
nucleotide sequence of the active site is complementary to the nucleotide sequence to be 
cleaved in an NOVX-encoding mRNA. See, e.g., U.S. Patent 4,987,071 to Cech, et al. and 
U.S. Patent 5,116,742 to Cech, et al. NOVX mRNA can also be used to select a catalytic 
RNA having a specific ribonuclease activity from a pool of RNA molecules. See, e.g., Bartel 
et al., (1993) Science 261:1411-1418. 

Alternatively, NOVX gene expression can be inhibited by targeting nucleotide 
sequences complementary to the regulatory region of the NOVX nucleic acid (e.g., the 
NOVX promoter and/or enhancers) to form triple helical structures that prevent transcription 
of the NOVX gene in target cells. See, e.g., Helene, 1991. Anticancer Drug Des. 6: 569-84; 
Helene, et al., 1992. Ann. N.Y. Acad. Sci. 660: 27-36; Maher, 1992. Bioassays 14: 807-15. 

In various embodiments, the NOVX nucleic acids can be modified at the base moiety, 
sugar moiety or phosphate backbone to improve, e.g., the stability, hybridization, or solubility 
of the molecule. For example, the deoxyribose phosphate backbone of the nucleic acids can 
be modified to generate peptide nucleic acids. See, e.g., Hyrup, et al., 1996. Bioorg Med 
Chem 4: 5-23. As used herein, the terms "peptide nucleic acids" or "PNAs" refer to nucleic 
acid mimics (e.g., DNA mimics) in which the deoxyribose phosphate backbone is replaced by 
a pseudopeptide backbone and only the four natural nucleobases are retained. The neutral 
backbone of PNAs has been shown to allow for specific hybridization to DNA and RNA 
under conditions of low ionic strength. The synthesis of PNA oligomers can be performed 
using standard solid phase peptide synthesis protocols as described in Hyrup, et al., 1996, 
supra; Perry-O'Keefe, et al., 1996. Proc. Natl. Acad. Sci. USA 93: 14670-14675. 

PNAs of NOVX can be used in therapeutic and diagnostic applications. For example, 
PNAs can be used as antisense or antigene agents for sequence-specific modulation of gene 
expression by, e.g., inducing transcription or translation arrest or inhibiting replication. 
PNAs of NOVX can also be used, for example, in the analysis of single base pair mutations 
in a gene (e.g., PNA directed PCR clamping); as artificial restriction enzymes when used in 
combination with other enzymes, e.g., S1 nucleases (see, Hyrup, et al., 1996, supra); or as 
probes or primers for DNA sequence and hybridization (see, Hyrup, et al., 1996, supra; 
Perry-O'Keefe, et al., 1996, supra). 

In another embodiment, PNAs of NOVX can be modified, e.g., to enhance their 
stability or cellular uptake, by attaching lipophilic or other helper groups to PNA, by the 
formation of PNA-DNA chimeras, or by the use of liposomes or other techniques of drug 
delivery known in the art. For example, PNA-DNA chimeras of NOVX can be generated 
that may combine the advantageous properties of PNA and DNA. Such chimeras allow DNA 
recognition enzymes (e.g., RNase H and DNA polymerases) to interact with the DNA portion 
while the PNA portion would provide high binding affinity and specificity. PNA-DNA 
chimeras can be linked using linkers of appropriate lengths selected in terms of base stacking, 
number of bonds between the nucleobases, and orientation (see, Hyrup, et al., 1996, supra). 
The synthesis of PNA-DNA chimeras can be performed as described in Hyrup, et al., 1996, 
supra and Finn, et al., 1996. Nucl Acids Res 24: 3357-3363. For example, a DNA chain can 
be synthesized on a solid support using standard phosphoramidite coupling chemistry, and 
modified nucleoside analogs, e.g., 5'-(4-methoxytrityl)amino-5'-deoxy-thymidine 
phosphoramidite, can be used between the PNA and the 5' end of DNA. See, e.g., Mag, et 
al., 1989. Nucl Acid Res 17: 5973-5988. PNA monomers are then coupled in a stepwise 
manner to produce a chimeric molecule with a 5' PNA segment and a 3' DNA segment. See, 
e.g., Finn, et al., 1996, supra. Alternatively, chimeric molecules can be synthesized with a 5' 
DNA segment and a 3' PNA segment. See, e.g., Petersen, et al., 1975. Bioorg. Med. Chem. 
Lett. 5: 1119-11124. 

In other embodiments, the oligonucleotide may include other appended groups such 
as peptides (e.g., for targeting host cell receptors in vivo), or agents facilitating transport 
across the cell membrane (see, e.g., Letsinger, et al., 1989. Proc. Natl. Acad. Sci. U.S.A. 86: 
6553-6556; Lemaitre, et al., 1987. Proc. Natl. Acad. Sci. 84: 648-652; PCT Publication No. 
WO88/09810) or the blood-brain barrier (see, e.g., PCT Publication No. WO 89/10134). In 
addition, oligonucleotides can be modified with hybridization triggered cleavage agents (see, 
e.g., Krol, et al., 1988. BioTechniques 6:958-976) or intercalating agents (see, e.g., Zon, 
1988. Pharm. Res. 5: 539-549). To this end, the oligonucleotide may be conjugated to 
another molecule, e.g., a peptide, a hybridization triggered cross-linking agent, a transport 
agent, a hybridization-triggered cleavage agent, and the like. 

NOVX Polypeptides 

A polypeptide according to the invention includes a polypeptide including the amino 
acid sequence of NOVX polypeptides whose sequences are provided in SEQ ID NOS:2, 4, 6, 
8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 
58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100, 102, 104, 
106, 108, 110, and 112. The invention also includes a mutant or variant protein any of whose 
residues may be changed from the corresponding residues shown in SEQ ID NOS:2, 4, 6, 8, 
10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 
60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100, 102, 104, 
106, 108, 110, and 112 while still encoding a protein that maintains its NOVX activities and 
physiological functions, or a functional fragment thereof. 

In general, an NOVX variant that preserves NOVX-like function includes any variant 
in which residues at a particular position in the sequence have been substituted by other 
amino acids, and further include the possibility of inserting an additional residue or residues 
between two residues of the parent protein as well as the possibility of deleting one or more 
residues from the parent sequence. Any amino acid substitution, insertion, or deletion is 
encompassed by the invention. In favorable circumstances, the substitution is a conservative 
substitution as defined above. 

One aspect of the invention pertains to isolated NOVX proteins, and biologically- 
active portions thereof, or derivatives, fragments, analogs or homologs thereof. Also 
provided are polypeptide fragments suitable for use as immunogens to raise anti-NOVX 
antibodies. In one embodiment, native NOVX proteins can be isolated from cells or tissue 
sources by an appropriate purification scheme using standard protein purification techniques. 
In another embodiment, NOVX proteins are produced by recombinant DNA techniques. 
Alternative to recombinant expression, an NOVX protein or polypeptide can be synthesized 
chemically using standard peptide synthesis techniques. 

An "isolated" or "purified" polypeptide or protein or biologically-active portion 
thereof is substantially free of cellular material or other contaminating proteins from the cell 
or tissue source from which the NOVX protein is derived, or substantially free from chemical 
precursors or other chemicals when chemically synthesized. The language "substantially free 
of cellular material" includes preparations of NOVX proteins in which the protein is 
separated from cellular components of the cells from which it is isolated or recombinantly- 
produced. In one embodiment, the language "substantially free of cellular material" includes 
preparations of NOVX proteins having less than about 30% (by dry weight) of non-NOVX 
proteins (also referred to herein as a "contaminating protein"), more preferably less than 
about 20% of non-NOVX proteins, still more preferably less than about 10% of non-NOVX 
proteins, and most preferably less than about 5% of non-NOVX proteins. When the NOVX 
protein or biologically-active portion thereof is recombinantly-produced, it is also preferably 
substantially free of culture medium, i.e., culture medium represents less than about 20%, 
more preferably less than about 10%, and most preferably less than about 5% of the volume 
of the NOVX protein preparation. 

The language "substantially free of chemical precursors or other chemicals" includes 
preparations of NOVX proteins in which the protein is separated from chemical precursors or 
other chemicals that are involved in the synthesis of the protein. In one embodiment, the 
language "substantially free of chemical precursors or other chemicals" includes preparations 
of NOVX proteins having less than about 30% (by dry weight) of chemical precursors or 
non-NOVX chemicals, more preferably less than about 20% chemical precursors or 
non-NOVX chemicals, still more preferably less than about 10% chemical precursors or 
non-NOVX chemicals, and most preferably less than about 5% chemical precursors or 
non-NOVX chemicals. 

Biologically-active portions of NOVX proteins include peptides comprising amino 
acid sequences sufficiently homologous to or derived from the amino acid sequences of the 
NOVX proteins (e.g., the amino acid sequence shown in SEQ ID NOS:2, 4, 6, 8, 10, 12, 14, 
16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 
66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100, 102, 104, 106, 108, 110, 
and 112) that include fewer amino acids than the full-length NOVX proteins, and exhibit at 
least one activity of an NOVX protein. Typically, biologically-active portions comprise a 
domain or motif with at least one activity of the NOVX protein. A biologically-active 
portion of an NOVX protein can be a polypeptide which is, for example, 10, 25, 50, 100 or 
more amino acid residues in length. 

Moreover, other biologically-active portions, in which other regions of the protein are 
deleted, can be prepared by recombinant techniques and evaluated for one or more of the 
functional activities of a native NOVX protein. 

In an embodiment, the NOVX protein has an amino acid sequence shown in SEQ ID 
NOS:2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 
52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 
100, 102, 104, 106, 108, 110, and 112. In other embodiments, the NOVX protein is 
substantially homologous to SEQ ID NOS:2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 
32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 
82, 84, 86, 88, 90, 92, 94, 96, 98, 100, 102, 104, 106, 108, 110, and 112, and retains the 
functional activity of the protein of SEQ ID NOS:2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 
28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 
78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100, 102, 104, 106, 108, 110, and 112, yet differs in 
amino acid sequence due to natural allelic variation or mutagenesis, as described in detail, 
below. Accordingly, in another embodiment, the NOVX protein is a protein that comprises 
an amino acid sequence at least about 45% homologous to the amino acid sequence SEQ ID 
NOS:2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 
52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 
100, 102, 104, 106, 108, 110, and 112, and retains the functional activity of the NOVX 
proteins of SEQ ID NOS:2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 
40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 
90, 92, 94, 96, 98, 100, 102, 104, 106, 108, 110, and 112. 

Determining Homology Between Two or More Sequences 

To determine the percent homology of two amino acid sequences or of two nucleic 
acids, the sequences are aligned for optimal comparison purposes (e.g., gaps can be 
introduced in the sequence of a first amino acid or nucleic acid sequence for optimal 
alignment with a second amino or nucleic acid sequence). The amino acid residues or 
nucleotides at corresponding amino acid positions or nucleotide positions are then compared. 
When a position in the first sequence is occupied by the same amino acid residue or 
nucleotide as the corresponding position in the second sequence, then the molecules are 
homologous at that position (i.e., as used herein amino acid or nucleic acid "homology" is 
equivalent to amino acid or nucleic acid "identity"). 

The nucleic acid sequence homology may be determined as the degree of identity 
between two sequences. The homology may be determined using computer programs known 
in the art, such as GAP software provided in the GCG program package. See, Needleman 
and Wunsch, 1970. J Mol Biol 48: 443-453. Using GCG GAP software with the following 
settings for nucleic acid sequence comparison: GAP creation penalty of 5.0 and GAP 
extension penalty of 0.3, the coding region of the analogous nucleic acid sequences referred 
to above exhibits a degree of identity preferably of at least 70%, 75%, 80%, 85%, 90%, 95%, 
98%, or 99%, with the CDS (encoding) part of the DNA sequence shown in SEQ ID NOS:1, 
3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 
55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 99, 101, 
103, 105, 107, 109 and 111. 

The term "sequence identity" refers to the degree to which two polynucleotide or 
polypeptide sequences are identical on a residue-by-residue basis over a particular region of 
comparison. The term "percentage of sequence identity" is calculated by comparing two 
optimally aligned sequences over that region of comparison, determining the number of 
positions at which the identical nucleic acid base (e.g., A, T, C, G, U, or I, in the case of 
nucleic acids) occurs in both sequences to yield the number of matched positions, dividing 
the number of matched positions by the total number of positions in the region of comparison 
(i.e., the window size), and multiplying the result by 100 to yield the percentage of sequence 
identity. The term "substantial identity" as used herein denotes a characteristic of a 
polynucleotide sequence, wherein the polynucleotide comprises a sequence that has at least 
80 percent sequence identity, preferably at least 85 percent identity and often 90 to 95 percent 
sequence identity, more usually at least 99 percent sequence identity as compared to a 
reference sequence over a comparison region. 
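
The calculation described above can be sketched in a few lines of Python. This is an illustrative sketch only: the function name `percent_identity`, the use of "-" as the gap character, and the assumption that the two sequences have already been optimally aligned to equal length are not taken from the specification.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of sequence identity over an aligned comparison window.

    Both inputs are assumed to be pre-aligned and of equal length, with
    "-" marking gap positions (an assumption of this sketch).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("comparison window is empty")
    # A position matches only when both sequences carry the same residue;
    # a gap character never counts as a match.
    matches = sum(
        1
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != "-"
    )
    # Divide matched positions by the window size, then scale to percent.
    return 100.0 * matches / len(aligned_a)


# Eight of ten positions are identical, so the result is 80.0.
example = percent_identity("ACGTACGTAC", "ACGTACGTTT")
```

Under this sketch, a value of 80.0 or higher would meet the "at least 80 percent" threshold given for substantial identity.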

Chimeric and Fusion Proteins 

The invention also provides NOVX chimeric or fusion proteins. As used herein, an 
NOVX "chimeric protein" or "fusion protein" comprises an NOVX polypeptide operatively- 
linked to a non-NOVX polypeptide. An "NOVX polypeptide" refers to a polypeptide having 
an amino acid sequence corresponding to an NOVX protein SEQ ID NOS:2, 4, 6, 8, 10, 12, 
14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 
64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100, 102, 104, 106, 108, 
110, and 112, whereas a "non-NOVX polypeptide" refers to a polypeptide having an amino 
acid sequence corresponding to a protein that is not substantially homologous to the NOVX 
protein, e.g., a protein that is different from the NOVX protein and that is derived from the 
same or a different organism. Within an NOVX fusion protein the NOVX polypeptide can 
correspond to all or a portion of an NOVX protein. In one embodiment, an NOVX fusion 
protein comprises at least one biologically-active portion of an NOVX protein. In another 
embodiment, an NOVX fusion protein comprises at least two biologically-active portions of 
an NOVX protein. In yet another embodiment, an NOVX fusion protein comprises at least 
three biologically-active portions of an NOVX protein. Within the fusion protein, the term 
"operatively-linked" is intended to indicate that the NOVX polypeptide and the non-NOVX 
polypeptide are fused in-frame with one another. The non-NOVX polypeptide can be fused 
to the N-terminus or C-terminus of the NOVX polypeptide. 

In one embodiment, the fusion protein is a GST-NOVX fusion protein in which the 
NOVX sequences are fused to the C-terminus of the GST (glutathione S-transferase) 
sequences. Such fusion proteins can facilitate the purification of recombinant NOVX 
polypeptides. 

In another embodiment, the fusion protein is an NOVX protein containing a 
heterologous signal sequence at its N-terminus. In certain host cells (e.g., mammalian host 
cells), expression and/or secretion of NOVX can be increased through use of a heterologous 
signal sequence. 

In yet another embodiment, the fusion protein is an NOVX-immunoglobulin fusion 
protein in which the NOVX sequences are fused to sequences derived from a member of the 
immunoglobulin protein family. The NOVX-immunoglobulin fusion proteins of the 
invention can be incorporated into pharmaceutical compositions and administered to a subject 
to inhibit an interaction between an NOVX ligand and an NOVX protein on the surface of a 
cell, to thereby suppress NOVX-mediated signal transduction in vivo. The NOVX- 
immunoglobulin fusion proteins can be used to affect the bioavailability of an NOVX 
cognate ligand. Inhibition of the NOVX ligand/NOVX interaction may be useful 
therapeutically for both the treatment of proliferative and differentiative disorders, as well as 
modulating (e.g., promoting or inhibiting) cell survival. Moreover, the 
NOVX-immunoglobulin fusion proteins of the invention can be used as immunogens to 
produce anti-NOVX antibodies in a subject, to purify NOVX ligands, and in screening assays 
to identify molecules that inhibit the interaction of NOVX with an NOVX ligand. 

An NOVX chimeric or fusion protein of the invention can be produced by standard 
recombinant DNA techniques. For example, DNA fragments coding for the different 
polypeptide sequences are ligated together in-frame in accordance with conventional 
techniques, e.g., by employing blunt-ended or stagger-ended termini for ligation, restriction 
enzyme digestion to provide for appropriate termini, filling-in of cohesive ends as 
appropriate, alkaline phosphatase treatment to avoid undesirable joining, and enzymatic 
ligation. In another embodiment, the fusion gene can be synthesized by conventional 
techniques including automated DNA synthesizers. Alternatively, PCR amplification of gene 
fragments can be carried out using anchor primers that give rise to complementary overhangs 
between two consecutive gene fragments that can subsequently be annealed and reamplified 
to generate a chimeric gene sequence (see, e.g., Ausubel, et al. (eds.) Current Protocols in 
Molecular Biology, John Wiley & Sons, 1992). Moreover, many expression vectors are 
commercially available that already encode a fusion moiety (e.g., a GST polypeptide). An 
NOVX-encoding nucleic acid can be cloned into such an expression vector such that the 
fusion moiety is linked in-frame to the NOVX protein. 
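
As an illustration of what "in-frame" means at the sequence level, the following Python sketch joins two coding sequences and checks that the reading frame is preserved. It is a simplified sketch under stated assumptions: the function name `fuse_in_frame`, the optional linker argument, and the toy sequences are hypothetical and do not represent NOVX, GST, or any real vector sequence.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def fuse_in_frame(n_terminal_cds: str, c_terminal_cds: str, linker: str = "") -> str:
    """Join two coding sequences so the second is read in the first one's frame."""
    upstream = n_terminal_cds.upper()
    downstream = c_terminal_cds.upper()
    linker = linker.upper()
    # Drop the stop codon of the upstream partner so translation reads through.
    if upstream[-3:] in STOP_CODONS:
        upstream = upstream[:-3]
    # Every segment must be a whole number of codons or the frame shifts.
    for name, segment in (
        ("upstream", upstream),
        ("linker", linker),
        ("downstream", downstream),
    ):
        if len(segment) % 3 != 0:
            raise ValueError(
                f"{name} segment length {len(segment)} is not a multiple of 3"
            )
    fusion = upstream + linker + downstream
    # No stop codon may appear in frame before the final codon.
    codons = [fusion[i:i + 3] for i in range(0, len(fusion), 3)]
    if any(codon in STOP_CODONS for codon in codons[:-1]):
        raise ValueError("premature in-frame stop codon in fusion")
    return fusion


# Toy sequences for illustration only (not real gene sequences).
fusion_gene = fuse_in_frame("ATGGCCTAA", "ATGAAATGA", linker="GGATCC")
```

A check of this kind only confirms that the junction preserves the codon frame; it says nothing about expression, folding, or activity of the resulting fusion protein.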

NOVX Agonists and Antagonists 

The invention also pertains to variants of the NOVX proteins that function as either 
NOVX agonists (i.e., mimetics) or as NOVX antagonists. Variants of the NOVX protein can 
be generated by mutagenesis (e.g., discrete point mutation or truncation of the NOVX 
protein). An agonist of the NOVX protein can retain substantially the same, or a subset of, 
the biological activities of the naturally occurring form of the NOVX protein. An antagonist 
of the NOVX protein can inhibit one or more of the activities of the naturally occurring form 
of the NOVX protein by, for example, competitively binding to a downstream or upstream 
member of a cellular signaling cascade which includes the NOVX protein. Thus, specific 
biological effects can be elicited by treatment with a variant of limited function. In one 
embodiment, treatment of a subject with a variant having a subset of the biological activities 
of the naturally occurring form of the protein has fewer side effects in a subject relative to 
treatment with the naturally occurring form of the NOVX proteins. 

Variants of the NOVX proteins that function as either NOVX agonists (i.e., mimetics) 
or as NOVX antagonists can be identified by screening combinatorial libraries of mutants 
(e.g., truncation mutants) of the NOVX proteins for NOVX protein agonist or antagonist 
activity. In one embodiment, a variegated library of NOVX variants is generated by 
combinatorial mutagenesis at the nucleic acid level and is encoded by a variegated gene 
library. A variegated library of NOVX variants can be produced by, for example, 
enzymatically ligating a mixture of synthetic oligonucleotides into gene sequences such that a 
degenerate set of potential NOVX sequences is expressible as individual polypeptides, or 
alternatively, as a set of larger fusion proteins (e.g., for phage display) containing the set of 
NOVX sequences therein. There are a variety of methods which can be used to produce 
libraries of potential NOVX variants from a degenerate oligonucleotide sequence. Chemical 
synthesis of a degenerate gene sequence can be performed in an automatic DNA synthesizer, 
and the synthetic gene then ligated into an appropriate expression vector. Use of a degenerate 
set of genes allows for the provision, in one mixture, of all of the sequences encoding the 
desired set of potential NOVX sequences. Methods for synthesizing degenerate 
oligonucleotides are well-known within the art. See, e.g., Narang, 1983. Tetrahedron 39: 3; 
Itakura, et al., 1984. Annu. Rev. Biochem. 53: 323; Itakura, et al., 1984. Science 198: 1056; 
Ike, et al., 1983. Nucl. Acids Res. 11: 477. 
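
To illustrate how a single degenerate oligonucleotide stands for a whole set of sequences, the following Python sketch enumerates every concrete sequence encoded by an oligonucleotide written with the standard IUPAC ambiguity codes. The function name `expand_degenerate` and the example oligonucleotide are illustrative assumptions and are not taken from the specification.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}


def expand_degenerate(oligo: str) -> list:
    """Return every concrete sequence represented by a degenerate oligonucleotide."""
    # One pool of allowed bases per position; an unknown symbol raises KeyError.
    pools = [IUPAC[base] for base in oligo.upper()]
    # The Cartesian product of the pools is the full degenerate set.
    return ["".join(combo) for combo in product(*pools)]


# "AAYGCN" covers both Asn codons (AAC, AAT) followed by all four Ala codons.
variants = expand_degenerate("AAYGCN")  # 2 * 4 = 8 sequences
```

The size of the set grows multiplicatively with each degenerate position, which is why a single synthesis can supply all of the sequences encoding the desired set of variants in one mixture.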

Polypeptide Libraries 

In addition, libraries of fragments of the NOVX protein coding sequences can be used 
to generate a variegated population of NOVX fragments for screening and subsequent 
selection of variants of an NOVX protein. In one embodiment, a library of coding sequence 
fragments can be generated by treating a double stranded PCR fragment of an NOVX coding 
sequence with a nuclease under conditions wherein nicking occurs only about once per 
molecule, denaturing the double stranded DNA, renaturing the DNA to form double-stranded 
DNA that can include sense/antisense pairs from different nicked products, removing single 
stranded portions from reformed duplexes by treatment with S1 nuclease, and ligating the 
resulting fragment library into an expression vector. By this method, expression libraries can 
be derived which encode N-terminal and internal fragments of various sizes of the NOVX 
proteins. 

Various techniques are known in the art for screening gene products of combinatorial 
libraries made by point mutations or truncation, and for screening cDNA libraries for gene 
products having a selected property. Such techniques are adaptable for rapid screening of the 
gene libraries generated by the combinatorial mutagenesis of NOVX proteins. The most 
widely used techniques, which are amenable to high throughput analysis, for screening large 
gene libraries typically include cloning the gene library into replicable expression vectors, 
transforming appropriate cells with the resulting library of vectors, and expressing the 
combinatorial genes under conditions in which detection of a desired activity facilitates 
isolation of the vector encoding the gene whose product was detected. Recursive ensemble 
mutagenesis (REM), a new technique that enhances the frequency of functional mutants in 
the libraries, can be used in combination with the screening assays to identify NOVX 
variants. See, e.g., Arkin and Youvan, 1992. Proc. Natl. Acad. Sci. USA 89: 7811-7815; 
Delagrave, et al., 1993. Protein Engineering 6:327-331. 

Anti-NOVX Antibodies 

Also included in the invention are antibodies to NOVX proteins, or fragments of 
NOVX proteins. The term "antibody" as used herein refers to immunoglobulin molecules 
and immunologically active portions of immunoglobulin (Ig) molecules, i.e., molecules that 
contain an antigen binding site that specifically binds (immunoreacts with) an antigen. Such 
antibodies include, but are not limited to, polyclonal, monoclonal, chimeric, single chain, Fab, 
Fab' and F(ab')2 fragments, and an Fab expression library. In general, an antibody molecule 
obtained from humans relates to any of the classes IgG, IgM, IgA, IgE and IgD, which differ 
from one another by the nature of the heavy chain present in the molecule. Certain classes 
have subclasses as well, such as IgG1, IgG2, and others. Furthermore, in humans, the light 
chain may be a kappa chain or a lambda chain. Reference herein to antibodies includes a 
reference to all such classes, subclasses and types of human antibody species. 

An isolated NOVX-related protein of the invention may be intended to serve as an 
antigen, or a portion or fragment thereof, and additionally can be used as an immunogen to 
generate antibodies that immunospecifically bind the antigen, using standard techniques for 
polyclonal and monoclonal antibody preparation. The full-length protein can be used or, 
alternatively, the invention provides antigenic peptide fragments of the antigen for use as 
immunogens. An antigenic peptide fragment comprises at least 6 amino acid residues of the 
amino acid sequence of the full length protein and encompasses an epitope thereof such that 
an antibody raised against the peptide forms a specific immune complex with the full length 
protein or with any fragment that contains the epitope. Preferably, the antigenic peptide 
comprises at least 10 amino acid residues, or at least 15 amino acid residues, or at least 20 
amino acid residues, or at least 30 amino acid residues. Preferred epitopes encompassed by 
the antigenic peptide are regions of the protein that are located on its surface; commonly 
these are hydrophilic regions. 

In certain embodiments of the invention, at least one epitope encompassed by the 
antigenic peptide is a region of NOVX-related protein that is located on the surface of the 
protein, e.g., a hydrophilic region. A hydrophobicity analysis of the human NOVX-related 
protein sequence will indicate which regions of a NOVX-related protein are particularly 
hydrophilic and, therefore, are likely to encode surface residues useful for targeting antibody 
production. As a means for targeting antibody production, hydropathy plots showing regions 
of hydrophilicity and hydrophobicity may be generated by any method well known in the art, 
including, for example, the Kyte Doolittle or the Hopp Woods methods, either with or 
without Fourier transformation. See, e.g., Hopp and Woods, 1981, Proc. Nat. Acad. Sci. USA 
78: 3824-3828; Kyte and Doolittle 1982, J. Mol. Biol. 157: 105-142, each of which is 
incorporated herein by reference in its entirety. Antibodies that are specific for one or more 
domains within an antigenic protein, or derivatives, fragments, analogs or homologs thereof, 
are also provided herein. 
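
A hydropathy profile of the kind referred to above can be sketched in Python as a sliding-window average over the Kyte Doolittle hydropathy scale. This is a minimal sketch without Fourier transformation; the function names, the default window of 7 residues, and the threshold of 0.0 for calling a window hydrophilic are assumptions chosen for illustration and are not specified in this document.

```python
# Kyte Doolittle hydropathy values (positive = hydrophobic, negative = hydrophilic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence: str, window: int = 7) -> list:
    """Mean hydropathy of each window of `window` consecutive residues."""
    sequence = sequence.upper()
    if window < 1 or window > len(sequence):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[residue] for residue in sequence]
    return [
        sum(scores[start:start + window]) / window
        for start in range(len(sequence) - window + 1)
    ]


def hydrophilic_windows(sequence: str, window: int = 7, threshold: float = 0.0) -> list:
    """Start indices (0-based) of windows whose mean hydropathy is below threshold."""
    profile = hydropathy_profile(sequence, window)
    return [start for start, value in enumerate(profile) if value < threshold]


# Toy peptide: a lysine-rich stretch followed by an isoleucine-rich stretch.
candidates = hydrophilic_windows("KKKKKIIIII", window=5)
```

Windows flagged by such a profile are only candidate surface regions; whether a given region is actually exposed and antigenic would have to be confirmed experimentally.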

A protein of the invention, or a derivative, fragment, analog, homolog or ortholog 
thereof, may be utilized as an immunogen in the generation of antibodies that 
immunospecifically bind these protein components. 

Various procedures known within the art may be used for the production of 
polyclonal or monoclonal antibodies directed against a protein of the invention, or against 
derivatives, fragments, analogs, homologs or orthologs thereof (see, for example, Antibodies: 
A Laboratory Manual, Harlow and Lane, 1988, Cold Spring Harbor Laboratory Press, Cold 
Spring Harbor, NY, incorporated herein by reference). Some of these antibodies are 
discussed below. 

Polyclonal Antibodies 

For the production of polyclonal antibodies, various suitable host animals (e.g., rabbit, 
goat, mouse or other mammal) may be immunized by one or more injections with the native 
protein, a synthetic variant thereof, or a derivative of the foregoing. An appropriate 
immunogenic preparation can contain, for example, the naturally occurring immunogenic 
protein, a chemically synthesized polypeptide representing the immunogenic protein, or a 
recombinantly expressed immunogenic protein. Furthermore, the protein may be conjugated 
to a second protein known to be immunogenic in the mammal being immunized. Examples 
of such immunogenic proteins include but are not limited to keyhole limpet hemocyanin, 
serum albumin, bovine thyroglobulin, and soybean trypsin inhibitor. The preparation can 
further include an adjuvant. Various adjuvants used to increase the immunological response 
include, but are not limited to, Freund's (complete and incomplete), mineral gels (e.g., 
aluminum hydroxide), surface active substances (e.g., lysolecithin, pluronic polyols, 
polyanions, peptides, oil emulsions, dinitrophenol, etc.), adjuvants usable in humans such as 
Bacille Calmette-Guerin and Corynebacterium parvum, or similar immunostimulatory agents. 
Additional examples of adjuvants which can be employed include MPL-TDM adjuvant 
(monophosphoryl Lipid A, synthetic trehalose dicorynomycolate). 

The polyclonal antibody molecules directed against the immunogenic protein can be 
isolated from the mammal (e.g., from the blood) and further purified by well known 
techniques, such as affinity chromatography using protein A or protein G, which provide 
primarily the IgG fraction of immune serum. Subsequently, or alternatively, the specific 
antigen which is the target of the immunoglobulin sought, or an epitope thereof, may be 
immobilized on a column to purify the immune specific antibody by immunoaffinity 
chromatography. Purification of immunoglobulins is discussed, for example, by D. 
Wilkinson (The Scientist, published by The Scientist, Inc., Philadelphia PA, Vol. 14, No. 8 
(April 17, 2000), pp. 25-28). 

Monoclonal Antibodies 

The term "monoclonal antibody" (MAb) or "monoclonal antibody composition", as 
used herein, refers to a population of antibody molecules that contain only one molecular 
species of antibody molecule consisting of a unique light chain gene product and a unique 
heavy chain gene product. In particular, the complementarity determining regions (CDRs) of 
the monoclonal antibody are identical in all the molecules of the population. MAbs thus 
contain an antigen binding site capable of immunoreacting with a particular epitope of the 
antigen characterized by a unique binding affinity for it. 

Monoclonal antibodies can be prepared using hybridoma methods, such as those 
described by Kohler and Milstein, Nature, 256:495 (1975). In a hybridoma method, a mouse, 
hamster, or other appropriate host animal, is typically immunized with an immunizing agent 
to elicit lymphocytes that produce or are capable of producing antibodies that will 
specifically bind to the immunizing agent. Alternatively, the lymphocytes can be immunized 
in vitro. 

The immunizing agent will typically include the protein antigen, a fragment thereof or 
a fusion protein thereof. Generally, either peripheral blood lymphocytes are used if cells of 
human origin are desired, or spleen cells or lymph node cells are used if non-human 
mammalian sources are desired. The lymphocytes are then fused with an immortalized cell 
line using a suitable fusing agent, such as polyethylene glycol, to form a hybridoma cell 
(Goding, Monoclonal Antibodies: Principles and Practice, Academic Press, (1986) pp. 
59-103). Immortalized cell lines are usually transformed mammalian cells, particularly 
myeloma cells of rodent, bovine and human origin. Usually, rat or mouse myeloma cell lines 
are employed. The hybridoma cells can be cultured in a suitable culture medium that 
preferably contains one or more substances that inhibit the growth or survival of the unfused, 
immortalized cells. For example, if the parental cells lack the enzyme hypoxanthine guanine 
phosphoribosyl transferase (HGPRT or HPRT), the culture medium for the hybridomas 
typically will include hypoxanthine, aminopterin, and thymidine ("HAT medium"), which 
substances prevent the growth of HGPRT-deficient cells. 

Preferred immortalized cell lines are those that fuse efficiently, support stable high 
level expression of antibody by the selected antibody-producing cells, and are sensitive to a 
medium such as HAT medium. More preferred immortalized cell lines are murine myeloma 
lines, which can be obtained, for instance, from the Salk Institute Cell Distribution Center, 
San Diego, California and the American Type Culture Collection, Manassas, Virginia. 
Human myeloma and mouse-human heteromyeloma cell lines also have been described for 
the production of human monoclonal antibodies (Kozbor, J. Immunol., 133:3001 (1984); 
Brodeur et al., Monoclonal Antibody Production Techniques and Applications, 
Marcel Dekker, Inc., New York, (1987) pp. 51-63). 

The culture medium in which the hybridoma cells are cultured can then be assayed for 
the presence of monoclonal antibodies directed against the antigen. Preferably, the binding 
specificity of monoclonal antibodies produced by the hybridoma cells is determined by 
immunoprecipitation or by an in vitro binding assay, such as radioimmunoassay (RIA) or 
enzyme-linked immunosorbent assay (ELISA). Such techniques and assays are known in 
the art. The binding affinity of the monoclonal antibody can, for example, be determined by 
the Scatchard analysis of Munson and Pollard, Anal. Biochem., 107:220 (1980). Preferably, 
antibodies having a high degree of specificity and a high binding affinity for the target 
antigen are isolated. 
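
For orientation, a classical linear Scatchard plot can be sketched in Python as follows. This is a simplified stand-in and not the computerized method of the cited reference: it assumes a single class of independent binding sites, for which bound/free = (Bmax - bound)/Kd, and fits a straight line by ordinary least squares. The function name `scatchard_fit` and the synthetic data are illustrative assumptions.

```python
def scatchard_fit(bound, free):
    """Estimate (Kd, Bmax) from paired bound and free ligand concentrations.

    Fits bound/free against bound with a least-squares line; for one class
    of sites the slope is -1/Kd and the y-intercept is Bmax/Kd.
    """
    if len(bound) != len(free) or len(bound) < 2:
        raise ValueError("need at least two paired measurements")
    x = list(bound)
    y = [b / f for b, f in zip(bound, free)]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    kd = -1.0 / slope          # dissociation constant
    bmax = intercept * kd      # total binding-site concentration
    return kd, bmax


# Synthetic, noise-free data generated from Kd = 2.0 and Bmax = 10.0
# (arbitrary concentration units, for illustration only).
free_conc = [0.5, 1.0, 2.0, 4.0, 8.0]
bound_conc = [10.0 * f / (2.0 + f) for f in free_conc]
kd_est, bmax_est = scatchard_fit(bound_conc, free_conc)
```

A smaller Kd corresponds to a higher binding affinity. With real, noisy data the linear transformation distorts the error structure, so a nonlinear fit is generally preferable.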

After the desired hybridoma cells are identified, the clones can be subcloned by 
limiting dilution procedures and grown by standard methods. Suitable culture media for this 
purpose include, for example, Dulbecco's Modified Eagle's Medium and RPMI-1640 
medium. Alternatively, the hybridoma cells can be grown in vivo as ascites in a mammal. 

The monoclonal antibodies secreted by the subclones can be isolated or purified from 
the culture medium or ascites fluid by conventional immunoglobulin purification procedures 
such as, for example, protein A-Sepharose, hydroxylapatite chromatography, gel 
electrophoresis, dialysis, or affinity chromatography. 

The monoclonal antibodies can also be made by recombinant DNA methods, such as 
those described in U.S. Patent No. 4,816,567. DNA encoding the monoclonal antibodies of 
the invention can be readily isolated and sequenced using conventional procedures (e.g., by 
using oligonucleotide probes that are capable of binding specifically to genes encoding the 
heavy and light chains of murine antibodies). The hybridoma cells of the invention serve as a 
preferred source of such DNA. Once isolated, the DNA can be placed into expression 
vectors, which are then transfected into host cells such as simian COS cells, Chinese hamster 
ovary (CHO) cells, or myeloma cells that do not otherwise produce immunoglobulin protein, 
to obtain the synthesis of monoclonal antibodies in the recombinant host cells. The DNA 
also can be modified, for example, by substituting the coding sequence for human heavy and 
light chain constant domains in place of the homologous murine sequences (U.S. Patent No. 
4,816,567; Morrison, Nature 368, 812-13 (1994)) or by covalently joining to the 
immunoglobulin coding sequence all or part of the coding sequence for a non- 
immunoglobulin polypeptide. Such a non-immunoglobulin polypeptide can be substituted 
for the constant domains of an antibody of the invention, or can be substituted for the variable 
domains of one antigen-combining site of an antibody of the invention to create a chimeric 
bivalent antibody. 

Humanized Antibodies 

The antibodies directed against the protein antigens of the invention can further 
comprise humanized antibodies or human antibodies. These antibodies are suitable for 
administration to humans without engendering an immune response by the human against the 
administered immunoglobulin. Humanized forms of antibodies are chimeric 
immunoglobulins, immunoglobulin chains or fragments thereof (such as Fv, Fab, Fab', F(ab')2 
or other antigen-binding subsequences of antibodies) that are principally comprised of the 
sequence of a human immunoglobulin, and contain minimal sequence derived from a non- 
human immunoglobulin. Humanization can be performed following the method of Winter 
and co-workers (Jones et al., Nature, 321:522-525 (1986); Riechmann et al., Nature, 
332:323-327 (1988); Verhoeyen et al., Science, 239:1534-1536 (1988)), by substituting 
rodent CDRs or CDR sequences for the corresponding sequences of a human antibody. (See 
also U.S. Patent No. 5,225,539.) In some instances, Fv framework residues of the human 
immunoglobulin are replaced by corresponding non-human residues. Humanized antibodies 
can also comprise residues which are found neither in the recipient antibody nor in the 
imported CDR or framework sequences. In general, the humanized antibody will comprise 
substantially all of at least one, and typically two, variable domains, in which all or 
substantially all of the CDR regions correspond to those of a non-human immunoglobulin 
and all or substantially all of the framework regions are those of a human immunoglobulin 
consensus sequence. The humanized antibody optimally also will comprise at least a portion 
of an immunoglobulin constant region (Fc), typically that of a human immunoglobulin (Jones 
et al., 1986; Riechmann et al., 1988; and Presta, Curr. Op. Struct. Biol., 2:593-596 (1992)). 

Human Antibodies 

Fully human antibodies relate to antibody molecules in which essentially the entire 
sequences of both the light chain and the heavy chain, including the CDRs, arise from human 
genes. Such antibodies are termed "human antibodies", or "fully human antibodies" herein. 
Human monoclonal antibodies can be prepared by the trioma technique; the human B-cell 
hybridoma technique (see Kozbor, et al., 1983 Immunol Today 4: 72) and the EBV 
hybridoma technique to produce human monoclonal antibodies (see Cole, et al., 1985 In: 
Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96). Human 
monoclonal antibodies may be utilized in the practice of the present invention and may be 
produced by using human hybridomas (see Cote, et al., 1983. Proc Natl Acad Sci USA 80: 
2026-2030) or by transforming human B-cells with Epstein Barr Virus in vitro (see Cole, et 
al., 1985 In: Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 
77-96). 

In addition, human antibodies can also be produced using additional techniques, 
including phage display libraries (Hoogenboom and Winter, J. Mol. Biol., 227:381 (1991); 
Marks et al., J. Mol. Biol., 222:581 (1991)). Similarly, human antibodies can be made by 
introducing human immunoglobulin loci into transgenic animals, e.g., mice in which the 
endogenous immunoglobulin genes have been partially or completely inactivated. Upon 
challenge, human antibody production is observed, which closely resembles that seen in 
humans in all respects, including gene rearrangement, assembly, and antibody repertoire. 
This approach is described, for example, in U.S. Patent Nos. 5,545,807; 5,545,806; 
5,569,825; 5,625,126; 5,633,425; 5,661,016, and in Marks et al. (Bio/Technology 10, 779-783 
(1992)); Lonberg et al. (Nature 368, 856-859 (1994)); Morrison (Nature 368, 812-13 (1994)); 
Fishwild et al. (Nature Biotechnology 14, 845-51 (1996)); Neuberger (Nature Biotechnology 
14, 826 (1996)); and Lonberg and Huszar (Intern. Rev. Immunol. 13, 65-93 (1995)). 

Human antibodies may additionally be produced using transgenic nonhuman animals 
which are modified so as to produce fully human antibodies rather than the animal's 
endogenous antibodies in response to challenge by an antigen. (See PCT publication 
WO94/02602). The endogenous genes encoding the heavy and light immunoglobulin chains 
in the nonhuman host have been incapacitated, and active loci encoding human heavy and 
light chain immunoglobulins are inserted into the host's genome. The human genes are 
incorporated, for example, using yeast artificial chromosomes containing the requisite human 
DNA segments. An animal which provides all the desired modifications is then obtained as 
progeny by crossbreeding intermediate transgenic animals containing fewer than the full 
complement of the modifications. The preferred embodiment of such a nonhuman animal is 
a mouse, and is termed the Xenomouse™ as disclosed in PCT publications WO 96/33735 
and WO 96/34096. This animal produces B cells which secrete fully human 
immunoglobulins. The antibodies can be obtained directly from the animal after 
immunization with an immunogen of interest, as, for example, a preparation of a polyclonal 
antibody, or alternatively from immortalized B cells derived from the animal, such as 
hybridomas producing monoclonal antibodies. Additionally, the genes encoding the 
immunoglobulins with human variable regions can be recovered and expressed to obtain the 
antibodies directly, or can be further modified to obtain analogs of antibodies such as, for 
example, single chain Fv molecules. 

An example of a method of producing a nonhuman host, exemplified as a mouse, 
lacking expression of an endogenous immunoglobulin heavy chain is disclosed in U.S. Patent 
No. 5,939,598. It can be obtained by a method including deleting the J segment genes from 
at least one endogenous heavy chain locus in an embryonic stem cell to prevent 
rearrangement of the locus and to prevent formation of a transcript of a rearranged 
immunoglobulin heavy chain locus, the deletion being effected by a targeting vector 
containing a gene encoding a selectable marker; and producing from the embryonic stem cell 
a transgenic mouse whose somatic and germ cells contain the gene encoding the selectable 
marker. 

A method for producing an antibody of interest, such as a human antibody, is 
disclosed in U.S. Patent No. 5,916,771. It includes introducing an expression vector that 
contains a nucleotide sequence encoding a heavy chain into one mammalian host cell in 
culture, introducing an expression vector containing a nucleotide sequence encoding a light 
chain into another mammalian host cell, and fusing the two cells to form a hybrid cell. The 
hybrid cell expresses an antibody containing the heavy chain and the light chain. 

In a further improvement on this procedure, a method for identifying a clinically 
relevant epitope on an immunogen, and a correlative method for selecting an antibody that 
binds immunospecifically to the relevant epitope with high affinity, are disclosed in PCT 
publication WO 99/53049. 

Fab Fragments and Single Chain Antibodies 

According to the invention, techniques can be adapted for the production of 
single-chain antibodies specific to an antigenic protein of the invention (see e.g., U.S. Patent 
No. 4,946,778). In addition, methods can be adapted for the construction of Fab expression 
libraries (see e.g., Huse, et al, 1989 Science 246: 1275-1281) to allow rapid and effective 
identification of monoclonal Fab fragments with the desired specificity for a protein or 
derivatives, fragments, analogs or homologs thereof. Antibody fragments that contain the 
idiotypes to a protein antigen may be produced by techniques known in the art including, but 
not limited to: (i) an F(ab')2 fragment produced by pepsin digestion of an antibody molecule; 
(ii) an Fab fragment generated by reducing the disulfide bridges of an F(ab')2 fragment; (iii) an 
Fab fragment generated by the treatment of the antibody molecule with papain and a reducing 
agent and (iv) Fv fragments. 

Bispecific Antibodies 

Bispecific antibodies are monoclonal, preferably human or humanized, antibodies that 
have binding specificities for at least two different antigens. In the present case, one of the 
binding specificities is for an antigenic protein of the invention. The second binding target is 
any other antigen, and advantageously is a cell-surface protein or receptor or receptor 
subunit. 

Methods for making bispecific antibodies are known in the art. Traditionally, the 
recombinant production of bispecific antibodies is based on the co-expression of two 
immunoglobulin heavy-chain/light-chain pairs, where the two heavy chains have different 
specificities (Milstein and Cuello, Nature, 305:537-539 (1983)). Because of the random
assortment of immunoglobulin heavy and light chains, these hybridomas (quadromas)
produce a potential mixture of ten different antibody molecules, of which only one has the
correct bispecific structure. The purification of the correct molecule is usually accomplished
by affinity chromatography steps. Similar procedures are disclosed in WO 93/08829,
published 13 May 1993, and in Traunecker et al., 1991 EMBO J., 10:3655-3659.
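
The figure of ten different antibody molecules can be checked by simple enumeration. The following Python sketch is illustrative only: it assumes that either heavy chain may pair with either light chain, that the two arms of an antibody are interchangeable, and it says nothing about the relative abundance of each species. The chain labels H1, H2, L1 and L2 are hypothetical names introduced for this example.

```python
from itertools import combinations_with_replacement, product

# Two heavy chains and two light chains are co-expressed in the quadroma.
# The labels are hypothetical; H1/L1 and H2/L2 are the cognate pairs.
heavy_chains = ["H1", "H2"]
light_chains = ["L1", "L2"]

# Any heavy chain may pair with any light chain, giving four half-molecules.
half_molecules = [h + l for h, l in product(heavy_chains, light_chains)]

# An antibody is an unordered pair of half-molecules (the arms are interchangeable).
antibodies = list(combinations_with_replacement(half_molecules, 2))

# The desired bispecific product joins the two cognate half-molecules.
bispecific = [ab for ab in antibodies if set(ab) == {"H1L1", "H2L2"}]

print(len(antibodies))  # total number of distinct assembled molecules
print(bispecific)       # the single correctly assembled bispecific species
```

Under these assumptions the enumeration yields ten species, only one of which carries both cognate heavy-chain/light-chain pairs.
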

Antibody variable domains with the desired binding specificities (antibody-antigen
combining sites) can be fused to immunoglobulin constant domain sequences. The fusion
preferably is with an immunoglobulin heavy-chain constant domain, comprising at least part
of the hinge, CH2, and CH3 regions. It is preferred to have the first heavy-chain constant
region (CH1) containing the site necessary for light-chain binding present in at least one of
the fusions. DNAs encoding the immunoglobulin heavy-chain fusions and, if desired, the 
immunoglobulin light chain, are inserted into separate expression vectors, and are co- 
transfected into a suitable host organism. For further details of generating bispecific 
antibodies see, for example, Suresh et al. Methods in Enzymology, 121:210 (1986). 

According to another approach described in WO 96/27011, the interface between a
pair of antibody molecules can be engineered to maximize the percentage of heterodimers
which are recovered from recombinant cell culture. The preferred interface comprises at least
a part of the CH3 region of an antibody constant domain. In this method, one or more small
amino acid side chains from the interface of the first antibody molecule are replaced with
larger side chains (e.g., tyrosine or tryptophan). Compensatory "cavities" of identical or
similar size to the large side chain(s) are created on the interface of the second antibody
molecule by replacing large amino acid side chains with smaller ones (e.g., alanine or
threonine). This provides a mechanism for increasing the yield of the heterodimer over other 
unwanted end-products such as homodimers. 

Bispecific antibodies can be prepared as full length antibodies or antibody fragments 
(e.g. F(ab')2 bispecific antibodies). Techniques for generating bispecific antibodies from 
antibody fragments have been described in the literature. For example, bispecific antibodies 
can be prepared using chemical linkage. Brennan et al. Science 229:81 (1985) describe a 
procedure wherein intact antibodies are proteolytically cleaved to generate F(ab')2 fragments. 
These fragments are reduced in the presence of the dithiol complexing agent sodium arsenite 
to stabilize vicinal dithiols and prevent intermolecular disulfide formation. The Fab' 
fragments generated are then converted to thionitrobenzoate (TNB) derivatives. One of the 
Fab'-TNB derivatives is then reconverted to the Fab' -thiol by reduction with 
mercaptoethylamine and is mixed with an equimolar amount of the other Fab'-TNB 
derivative to form the bispecific antibody. The bispecific antibodies produced can be used as 
agents for the selective immobilization of enzymes. 

Additionally, Fab' fragments can be directly recovered from E. coli and chemically 
coupled to form bispecific antibodies. Shalaby et al., J. Exp. Med., 175:217-225 (1992)
describe the production of a fully humanized bispecific antibody F(ab')2 molecule. Each
Fab' fragment was separately secreted from E. coli and subjected to directed chemical
coupling in vitro to form the bispecific antibody. The bispecific antibody thus formed was 
able to bind to cells overexpressing the ErbB2 receptor and normal human T cells, as well as 
trigger the lytic activity of human cytotoxic lymphocytes against human breast tumor targets. 

Various techniques for making and isolating bispecific antibody fragments directly 
from recombinant cell culture have also been described. For example, bispecific antibodies 
have been produced using leucine zippers. Kostelny et al., J. Immunol. 148(5):1547-1553
(1992). The leucine zipper peptides from the Fos and Jun proteins were linked to the Fab'
portions of two different antibodies by gene fusion. The antibody homodimers were reduced
at the hinge region to form monomers and then re-oxidized to form the antibody
heterodimers. This method can also be utilized for the production of antibody homodimers.
The "diabody" technology described by Hollinger et al., Proc. Natl. Acad. Sci. USA
90:6444-6448 (1993) has provided an alternative mechanism for making bispecific antibody
fragments. The fragments comprise a heavy-chain variable domain (VH) connected to a
light-chain variable domain (VL) by a linker which is too short to allow pairing between the
two domains on the same chain. Accordingly, the VH and VL domains of one fragment are
forced to pair with the complementary VL and VH domains of another fragment, thereby
forming two antigen-binding sites. Another strategy for making bispecific antibody
fragments by the use of single-chain Fv (sFv) dimers has also been reported. See Gruber et
al., J. Immunol. 152:5368 (1994).

Antibodies with more than two valencies are contemplated. For example, trispecific 
antibodies can be prepared. Tutt et al., J. Immunol. 147:60 (1991).

Exemplary bispecific antibodies can bind to two different epitopes, at least one of 
which originates in the protein antigen of the invention. Alternatively, an anti-antigenic arm 
of an immunoglobulin molecule can be combined with an arm which binds to a triggering 
molecule on a leukocyte such as a T-cell receptor molecule (e.g., CD2, CD3, CD28, or B7), or
Fc receptors for IgG (FcγR), such as FcγRI (CD64), FcγRII (CD32) and FcγRIII (CD16) so
as to focus cellular defense mechanisms to the cell expressing the particular antigen.
Bispecific antibodies can also be used to direct cytotoxic agents to cells which express a
particular antigen. These antibodies possess an antigen-binding arm and an arm which binds
a cytotoxic agent or a radionuclide chelator, such as EOTUBE, DTPA, DOTA, or TETA.
Another bispecific antibody of interest binds the protein antigen described herein and further 
binds tissue factor (TF).

Heteroconjugate Antibodies

Heteroconjugate antibodies are also within the scope of the present invention. 
Heteroconjugate antibodies are composed of two covalently joined antibodies. Such 
antibodies have, for example, been proposed to target immune system cells to unwanted cells 
(U.S. Patent No. 4,676,980), and for treatment of HIV infection (WO 91/00360; WO 
92/200373; EP 03089). It is contemplated that the antibodies can be prepared in vitro using
known methods in synthetic protein chemistry, including those involving crosslinking agents. 
For example, immunotoxins can be constructed using a disulfide exchange reaction or by 
forming a thioether bond. Examples of suitable reagents for this purpose include 
iminothiolate and methyl-4-mercaptobutyrimidate and those disclosed, for example, in U.S. 
Patent No. 4,676,980. 

Effector Function Engineering 

It can be desirable to modify the antibody of the invention with respect to effector 
function, so as to enhance, e.g., the effectiveness of the antibody in treating cancer. For 
example, cysteine residue(s) can be introduced into the Fc region, thereby allowing interchain 
disulfide bond formation in this region. The homodimeric antibody thus generated can have 
improved internalization capability and/or increased complement-mediated cell killing and 
antibody-dependent cellular cytotoxicity (ADCC). See Caron et al., J. Exp. Med., 176: 1191-1195
(1992) and Shopes, J. Immunol., 148: 2918-2922 (1992). Homodimeric antibodies with
enhanced anti-tumor activity can also be prepared using heterobifunctional cross-linkers as
described in Wolff et al., Cancer Research, 53: 2560-2565 (1993). Alternatively, an antibody
can be engineered that has dual Fc regions and can thereby have enhanced complement lysis 
and ADCC capabilities. See Stevenson et al, Anti-Cancer Drug Design, 3: 219-230 (1989). 

Immunoconjugates

The invention also pertains to immunoconjugates comprising an antibody conjugated 
to a cytotoxic agent such as a chemotherapeutic agent, toxin (e.g., an enzymatically active 
toxin of bacterial, fungal, plant, or animal origin, or fragments thereof), or a radioactive 
isotope (i.e., a radioconjugate).

Chemotherapeutic agents useful in the generation of such immunoconjugates have 
been described above. Enzymatically active toxins and fragments thereof that can be used 
include diphtheria A chain, nonbinding active fragments of diphtheria toxin, exotoxin A 
chain (from Pseudomonas aeruginosa), ricin A chain, abrin A chain, modeccin A chain,
alpha-sarcin, Aleurites fordii proteins, dianthin proteins, Phytolacca americana proteins (PAPI,
PAPII, and PAP-S), Momordica charantia inhibitor, curcin, crotin, Saponaria officinalis
inhibitor, gelonin, mitogellin, restrictocin, phenomycin, enomycin, and the trichothecenes. A
variety of radionuclides are available for the production of radioconjugated antibodies.
Examples include 212Bi, 131I, 131In, 90Y, and 186Re.

Conjugates of the antibody and cytotoxic agent are made using a variety of 
bifunctional protein-coupling agents such as N-succinimidyl-3-(2-pyridyldithiol) propionate 
(SPDP), iminothiolane (IT), bifunctional derivatives of imidoesters (such as dimethyl 
adipimidate HCl), active esters (such as disuccinimidyl suberate), aldehydes (such as
glutaraldehyde), bis-azido compounds (such as bis (p-azidobenzoyl) hexanediamine), bis-
diazonium derivatives (such as bis-(p-diazoniumbenzoyl)-ethylenediamine), diisocyanates
(such as toluene 2,6-diisocyanate), and bis-active fluorine compounds (such as 1,5-difluoro-
2,4-dinitrobenzene). For example, a ricin immunotoxin can be prepared as described in
Vitetta et al., Science, 238: 1098 (1987). Carbon-14-labeled 1-isothiocyanatobenzyl-3-
methyldiethylene triaminepentaacetic acid (MX-DTPA) is an exemplary chelating agent for
conjugation of radionuclide to the antibody. See WO 94/11026.

In another embodiment, the antibody can be conjugated to a "receptor" (such as
streptavidin) for utilization in tumor pretargeting wherein the antibody-receptor conjugate is 
administered to the patient, followed by removal of unbound conjugate from the circulation 
using a clearing agent and then administration of a "ligand" (e.g., avidin) that is in turn 
conjugated to a cytotoxic agent. 

In one embodiment, methods for the screening of antibodies that possess the desired 
specificity include, but are not limited to, enzyme-linked immunosorbent assay (ELISA) and 
other immunologically-mediated techniques known within the art. In a specific embodiment, 
selection of antibodies that are specific to a particular domain of an NOVX protein is 
facilitated by generation of hybridomas that bind to the fragment of an NOVX protein 
possessing such a domain. Thus, antibodies that are specific for a desired domain within an 
NOVX protein, or derivatives, fragments, analogs or homologs thereof, are also provided 
herein. 

Anti-NOVX antibodies may be used in methods known within the art relating to the 
localization and/or quantitation of an NOVX protein (e.g., for use in measuring levels of the 
NOVX protein within appropriate physiological samples, for use in diagnostic methods, for 
use in imaging the protein, and the like). In a given embodiment, antibodies for NOVX 
proteins, or derivatives, fragments, analogs or homologs thereof, that contain the
antibody-derived binding domain, are utilized as pharmacologically-active compounds (hereinafter
"Therapeutics"). 

An anti-NOVX antibody (e.g., monoclonal antibody) can be used to isolate an NOVX 
polypeptide by standard techniques, such as affinity chromatography or immunoprecipitation. 
An anti-NOVX antibody can facilitate the purification of natural NOVX polypeptide from 
cells and of recombinantly-produced NOVX polypeptide expressed in host cells. Moreover, 
an anti-NOVX antibody can be used to detect NOVX protein (e.g., in a cellular lysate or cell
supernatant) in order to evaluate the abundance and pattern of expression of the NOVX 
protein. Anti-NOVX antibodies can be used diagnostically to monitor protein levels in tissue 
as part of a clinical testing procedure, e.g., to determine the efficacy of a given
treatment regimen. Detection can be facilitated by coupling (i.e., physically linking) the
antibody to a detectable substance. Examples of detectable substances include various 
enzymes, prosthetic groups, fluorescent materials, luminescent materials, bioluminescent 
materials, and radioactive materials. Examples of suitable enzymes include horseradish 
peroxidase, alkaline phosphatase, β-galactosidase, or acetylcholinesterase; examples of
suitable prosthetic group complexes include streptavidin/biotin and avidin/biotin; examples 
of suitable fluorescent materials include umbelliferone, fluorescein, fluorescein 
isothiocyanate, rhodamine, dichlorotriazinylamine fluorescein, dansyl chloride or 
phycoerythrin; an example of a luminescent material includes luminol; examples of 
bioluminescent materials include luciferase, luciferin, and aequorin, and examples of suitable 
radioactive material include 35S or 3H.

NOVX Recombinant Expression Vectors and Host Cells 

Another aspect of the invention pertains to vectors, preferably expression vectors, 
containing a nucleic acid encoding an NOVX protein, or derivatives, fragments, analogs or 
homologs thereof. As used herein, the term "vector" refers to a nucleic acid molecule capable
of transporting another nucleic acid to which it has been linked. One type of vector is a 
"plasmid", which refers to a circular double stranded DNA loop into which additional DNA 
segments can be ligated. Another type of vector is a viral vector, wherein additional DNA 
segments can be ligated into the viral genome. Certain vectors are capable of autonomous 
replication in a host cell into which they are introduced (e.g., bacterial vectors having a 
bacterial origin of replication and episomal mammalian vectors). Other vectors (e.g., 
non-episomal mammalian vectors) are integrated into the genome of a host cell upon 
introduction into the host cell, and thereby are replicated along with the host genome.
Moreover, certain vectors are capable of directing the expression of genes to which they are
operatively-linked. Such vectors are referred to herein as "expression vectors". In general, 
expression vectors of utility in recombinant DNA techniques are often in the form of 
plasmids. In the present specification, "plasmid" and "vector" can be used interchangeably as 
the plasmid is the most commonly used form of vector. However, the invention is intended 
to include such other forms of expression vectors, such as viral vectors (e.g., replication
defective retroviruses, adenoviruses and adeno-associated viruses), which serve equivalent 
functions. 

The recombinant expression vectors of the invention comprise a nucleic acid of the 
invention in a form suitable for expression of the nucleic acid in a host cell, which means that 
the recombinant expression vectors include one or more regulatory sequences, selected on the 
basis of the host cells to be used for expression, that are operatively-linked to the nucleic acid
sequence to be expressed. Within a recombinant expression vector, "operably-linked" is 
intended to mean that the nucleotide sequence of interest is linked to the regulatory 
sequence(s) in a manner that allows for expression of the nucleotide sequence (e.g., in an in 
vitro transcription/translation system or in a host cell when the vector is introduced into the 
host cell). 

The term "regulatory sequence" is intended to include promoters, enhancers and
other expression control elements (e.g., polyadenylation signals). Such regulatory sequences
are described, for example, in Goeddel, Gene Expression Technology: Methods in
Enzymology 185, Academic Press, San Diego, Calif. (1990). Regulatory sequences include
those that direct constitutive expression of a nucleotide sequence in many types of host cell
and those that direct expression of the nucleotide sequence only in certain host cells (e.g.,
tissue-specific regulatory sequences). It will be appreciated by those skilled in the art that the 
design of the expression vector can depend on such factors as the choice of the host cell to be 
transformed, the level of expression of protein desired, etc. The expression vectors of the 
invention can be introduced into host cells to thereby produce proteins or peptides, including 
fusion proteins or peptides, encoded by nucleic acids as described herein (e.g., NOVX
proteins, mutant forms of NOVX proteins, fusion proteins, etc.). 

The recombinant expression vectors of the invention can be designed for expression 
of NOVX proteins in prokaryotic or eukaryotic cells. For example, NOVX proteins can be 
expressed in bacterial cells such as Escherichia coli, insect cells (using baculovirus 
expression vectors), yeast cells or mammalian cells. Suitable host cells are discussed further in
Goeddel, Gene Expression Technology: Methods in Enzymology 185, Academic Press,
San Diego, Calif. (1990). Alternatively, the recombinant expression vector can be
transcribed and translated in vitro, for example using T7 promoter regulatory sequences and 
T7 polymerase. 

Expression of proteins in prokaryotes is most often carried out in Escherichia coli 
with vectors containing constitutive or inducible promoters directing the expression of either 
fusion or non-fusion proteins. Fusion vectors add a number of amino acids to a protein 
encoded therein, usually to the amino terminus of the recombinant protein. Such fusion 
vectors typically serve three purposes: (i) to increase expression of recombinant protein; (ii)
to increase the solubility of the recombinant protein; and (iii) to aid in the purification of the
recombinant protein by acting as a ligand in affinity purification. Often, in fusion expression
vectors, a proteolytic cleavage site is introduced at the junction of the fusion moiety and the
recombinant protein to enable separation of the recombinant protein from the fusion moiety
subsequent to purification of the fusion protein. Such enzymes, and their cognate recognition
sequences, include Factor Xa, thrombin and enterokinase. Typical fusion expression vectors
include pGEX (Pharmacia Biotech Inc; Smith and Johnson, 1988. Gene 67: 31-40), pMAL
(New England Biolabs, Beverly, Mass.) and pRIT5 (Pharmacia, Piscataway, N.J.) that fuse
glutathione S-transferase (GST), maltose E binding protein, or protein A, respectively, to the
target recombinant protein. 
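
The layout of such a fusion protein, and its separation at the engineered cleavage site, can be sketched in Python as follows. This is an illustration under stated assumptions, not a description of any particular vector: the recognition motifs are the commonly cited consensus sequences for the three proteases (actual protease specificity is broader), the tag and target sequences are short hypothetical placeholders, and the helper names build_fusion and cleave are introduced only for this example.

```python
# Commonly cited consensus motifs (one-letter amino acid codes) and the number
# of motif residues that remain on the N-terminal side of the cut.
# Assumption: real proteases tolerate variations of these motifs.
CLEAVAGE_SITES = {
    "factor_xa": ("IEGR", 4),      # cut after the arginine of IEGR
    "thrombin": ("LVPRGS", 4),     # cut between R and G, leaving GS on the target
    "enterokinase": ("DDDDK", 5),  # cut after the lysine of DDDDK
}


def build_fusion(tag, protease, target):
    """Join an amino-terminal fusion moiety, a cleavage motif and the target protein."""
    motif, _ = CLEAVAGE_SITES[protease]
    return tag + motif + target


def cleave(fusion, protease):
    """Split a fusion protein at the first occurrence of the protease motif."""
    motif, offset = CLEAVAGE_SITES[protease]
    start = fusion.find(motif)
    if start < 0:
        return (fusion,)  # no recognition site present, so nothing is cut
    cut = start + offset
    return fusion[:cut], fusion[cut:]


# Hypothetical placeholder sequences, not real GST or NOVX sequences.
tag = "MSPILG"
target = "MKTAYIAK"

fusion = build_fusion(tag, "factor_xa", target)
print(cleave(fusion, "factor_xa"))
print(cleave(build_fusion(tag, "thrombin", target), "thrombin"))
```

In this sketch a Factor Xa or enterokinase site releases the target protein without extra residues, whereas the thrombin motif leaves two residues (GS) at the amino terminus of the target.
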

Examples of suitable inducible non-fusion E. coli expression vectors include pTrc 
(Amrann et al., (1988) Gene 69:301-315) and pET 11d (Studier et al., Gene Expression
Technology: Methods in Enzymology 185, Academic Press, San Diego, Calif. (1990)
60-89). 

One strategy to maximize recombinant protein expression in E. coli is to express the 
protein in a host bacteria with an impaired capacity to proteolytically cleave the recombinant 
protein. See, e.g., Gottesman, Gene Expression Technology: Methods in Enzymology
185, Academic Press, San Diego, Calif. (1990) 119-128. Another strategy is to alter the
nucleic acid sequence of the nucleic acid to be inserted into an expression vector so that the
individual codons for each amino acid are those preferentially utilized in E. coli (see, e.g.,
Wada, et al., 1992. Nucl. Acids Res. 20: 2111-2118). Such alteration of nucleic acid
sequences of the invention can be carried out by standard DNA synthesis techniques. 
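
The codon alteration described above amounts to replacing each codon with a synonymous codon favored by the host while leaving the encoded protein unchanged. The Python sketch below illustrates the idea. The PREFERRED table is an assumption made for illustration (one frequently used E. coli codon per amino acid) and should be replaced by measured codon usage data for the host strain actually used; the function names translate and recode and the input sequence are hypothetical.

```python
from itertools import product

# Standard genetic code, built in the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Illustrative assumption: one frequently used E. coli codon per amino acid.
# Replace with a measured codon usage table for the intended host.
PREFERRED = {
    "A": "GCG", "R": "CGT", "N": "AAC", "D": "GAT", "C": "TGC",
    "Q": "CAG", "E": "GAA", "G": "GGC", "H": "CAT", "I": "ATT",
    "L": "CTG", "K": "AAA", "M": "ATG", "F": "TTT", "P": "CCG",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAT", "V": "GTG",
    "*": "TAA",
}


def split_codons(cds):
    """Split a coding sequence into codons, rejecting partial codons."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return [cds[i:i + 3] for i in range(0, len(cds), 3)]


def translate(cds):
    """Translate a coding sequence into one-letter amino acid codes."""
    return "".join(CODON_TABLE[codon] for codon in split_codons(cds))


def recode(cds):
    """Replace every codon with the preferred synonymous codon."""
    return "".join(PREFERRED[CODON_TABLE[codon]] for codon in split_codons(cds))


original = "ATGCTAAGGTGA"  # hypothetical input: Met-Leu-Arg-stop
optimized = recode(original)
print(optimized)
print(translate(original) == translate(optimized))  # protein is unchanged
```

A single preferred codon per amino acid is the simplest possible scheme; it ignores considerations such as mRNA secondary structure and repeated sequence motifs, which a practical design would also weigh.
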

In another embodiment, the NOVX expression vector is a yeast expression vector. 
Examples of vectors for expression in yeast Saccharomyces cerevisiae include pYepSec1
(Baldari, et al., 1987. EMBO J. 6: 229-234), pMFa (Kurjan and Herskowitz, 1982. Cell 30:
933-943), pJRY88 (Schultz et al., 1987. Gene 54: 113-123), pYES2 (Invitrogen Corporation,
San Diego, Calif.), and picZ (Invitrogen Corp., San Diego, Calif.).

Alternatively, NOVX can be expressed in insect cells using baculovirus expression
vectors. Baculovirus vectors available for expression of proteins in cultured insect cells (e.g.,
SF9 cells) include the pAc series (Smith, et al., 1983. Mol. Cell. Biol. 3: 2156-2165) and the
pVL series (Lucklow and Summers, 1989. Virology 170: 31-39). 

In yet another embodiment, a nucleic acid of the invention is expressed in mammalian 
cells using a mammalian expression vector. Examples of mammalian expression vectors 
include pCDM8 (Seed, 1987. Nature 329: 840) and pMT2PC (Kaufman, et al, 1987. EMBO 
J. 6: 187-195). When used in mammalian cells, the expression vector's control functions are 
often provided by viral regulatory elements. For example, commonly used promoters are 
derived from polyoma, adenovirus 2, cytomegalovirus, and simian virus 40. For other 
suitable expression systems for both prokaryotic and eukaryotic cells see, e.g., Chapters 16
and 17 of Sambrook, et al., Molecular Cloning: A Laboratory Manual. 2nd ed., Cold
Spring Harbor Laboratory, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 
1989. 

In another embodiment, the recombinant mammalian expression vector is capable of 
directing expression of the nucleic acid preferentially in a particular cell type (e.g., 
tissue-specific regulatory elements are used to express the nucleic acid). Tissue-specific 
regulatory elements are known in the art. Non-limiting examples of suitable tissue-specific 
promoters include the albumin promoter (liver-specific; Pinkert, et al., 1987. Genes Dev. 1:
268-277), lymphoid-specific promoters (Calame and Eaton, 1988. Adv. Immunol. 43:
235-275), in particular promoters of T cell receptors (Winoto and Baltimore, 1989. EMBO J.
8: 729-733) and immunoglobulins (Banerji, et al., 1983. Cell 33: 729-740; Queen and
Baltimore, 1983. Cell 33: 741-748), neuron-specific promoters (e.g., the neurofilament
promoter; Byrne and Ruddle, 1989. Proc. Natl. Acad. Sci. USA 86: 5473-5477),
pancreas-specific promoters (Edlund, et al., 1985. Science 230: 912-916), and mammary
gland-specific promoters (e.g., milk whey promoter; U.S. Pat. No. 4,873,316 and European
Application Publication No. 264,166). Developmentally-regulated promoters are also
encompassed, e.g., the murine hox promoters (Kessel and Gruss, 1990. Science 249:
374-379) and the α-fetoprotein promoter (Campes and Tilghman, 1989. Genes Dev. 3:
537-546). 

The invention further provides a recombinant expression vector comprising a DNA
molecule of the invention cloned into the expression vector in an antisense orientation. That
is, the DNA molecule is operatively-linked to a regulatory sequence in a manner that allows
for expression (by transcription of the DNA molecule) of an RNA molecule that is antisense 
to NOVX mRNA. Regulatory sequences operatively linked to a nucleic acid cloned in the 
antisense orientation can be chosen that direct the continuous expression of the antisense 
RNA molecule in a variety of cell types, for instance viral promoters and/or enhancers, or 
regulatory sequences can be chosen that direct constitutive, tissue specific or cell type 
specific expression of antisense RNA. The antisense expression vector can be in the form of 
a recombinant plasmid, phagemid or attenuated virus in which antisense nucleic acids are 
produced under the control of a high efficiency regulatory region, the activity of which can be 
determined by the cell type into which the vector is introduced. For a discussion of the 
regulation of gene expression using antisense genes see, e.g., Weintraub, et al, "Antisense 
RNA as a molecular tool for genetic analysis," Reviews-Trends in Genetics, Vol. 1(1) 1986. 
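
The effect of cloning a DNA molecule in the antisense orientation can be illustrated with a short Python sketch. It assumes a simplified model in which the strand read 5' to 3' downstream of the promoter has the same sequence as the transcript (with U in place of T); the sequence used and the helper names reverse_complement and transcribe are hypothetical and introduced only for this example.

```python
# Watson-Crick complements for DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def transcribe(coding_strand):
    """Simplified transcription: the RNA matches the coding strand, with U for T."""
    return coding_strand.upper().replace("T", "U")


# Hypothetical fragment of a coding sequence, written 5' to 3'.
sense_insert = "ATGGCCATTGTA"

# Sense orientation: the transcript corresponds to the mRNA.
mrna = transcribe(sense_insert)

# Antisense orientation: the insert is flipped relative to the promoter, so the
# strand read downstream of the promoter is the reverse complement.
antisense_insert = reverse_complement(sense_insert)
antisense_rna = transcribe(antisense_insert)

# The antisense transcript is complementary to the mRNA along its whole length.
pairs_with_mrna = reverse_complement(antisense_rna.replace("U", "T")) == mrna.replace("U", "T")

print(mrna)
print(antisense_rna)
print(pairs_with_mrna)
```

Because the antisense transcript is the reverse complement of the mRNA, the two RNAs can base-pair, which is the basis for the regulation of gene expression discussed in the cited review.
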

Another aspect of the invention pertains to host cells into which a recombinant 
expression vector of the invention has been introduced. The terms "host cell" and 
"recombinant host cell" are used interchangeably herein. It is understood that such terms 
refer not only to the particular subject cell but also to the progeny or potential progeny of 
such a cell. Because certain modifications may occur in succeeding generations due to either 
mutation or environmental influences, such progeny may not, in fact, be identical to the 
parent cell, but are still included within the scope of the term as used herein. 

A host cell can be any prokaryotic or eukaryotic cell. For example, NOVX protein 
can be expressed in bacterial cells such as E. coli, insect cells, yeast or mammalian cells
(such as Chinese hamster ovary cells (CHO) or COS cells). Other suitable host cells are 
known to those skilled in the art. 

Vector DNA can be introduced into prokaryotic or eukaryotic cells via conventional 
transformation or transfection techniques. As used herein, the terms "transformation" and 
"transfection" are intended to refer to a variety of art-recognized techniques for introducing 
foreign nucleic acid (e.g., DNA) into a host cell, including calcium phosphate or calcium 
chloride co-precipitation, DEAE-dextran-mediated transfection, lipofection, or 
electroporation. Suitable methods for transforming or transfecting host cells can be found in 
Sambrook, et al. (Molecular Cloning: A Laboratory Manual. 2nd ed., Cold Spring
Harbor Laboratory, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1989), 
and other laboratory manuals. 

For stable transfection of mammalian cells, it is known that, depending upon the 
expression vector and transfection technique used, only a small fraction of cells may integrate
the foreign DNA into their genome. In order to identify and select these integrants, a gene
that encodes a selectable marker (e.g., resistance to antibiotics) is generally introduced into
the host cells along with the gene of interest. Various selectable markers include those that
confer resistance to drugs, such as G418, hygromycin and methotrexate. Nucleic acid
encoding a selectable marker can be introduced into a host cell on the same vector as that
encoding NOVX or can be introduced on a separate vector. Cells stably transfected with the
introduced nucleic acid can be identified by drug selection (e.g., cells that have incorporated 
the selectable marker gene will survive, while the other cells die). 

A host cell of the invention, such as a prokaryotic or eukaryotic host cell in culture, 
can be used to produce (i.e., express) NOVX protein. Accordingly, the invention further
provides methods for producing NOVX protein using the host cells of the invention. In one
embodiment, the method comprises culturing the host cell of the invention (into which a
recombinant expression vector encoding NOVX protein has been introduced) in a suitable 
medium such that NOVX protein is produced. In another embodiment, the method further 
comprises isolating NOVX protein from the medium or the host cell. 

Transgenic NOVX Animals 

The host cells of the invention can also be used to produce non-human transgenic 
animals. For example, in one embodiment, a host cell of the invention is a fertilized oocyte 
or an embryonic stem cell into which NOVX protein-coding sequences have been introduced. 
Such host cells can then be used to create non-human transgenic animals in which exogenous 
NOVX sequences have been introduced into their genome or homologous recombinant 
animals in which endogenous NOVX sequences have been altered. Such animals are useful 
for studying the function and/or activity of NOVX protein and for identifying and/or 
evaluating modulators of NOVX protein activity. As used herein, a "transgenic animal" is a 
non-human animal, preferably a mammal, more preferably a rodent such as a rat or mouse, in 
which one or more of the cells of the animal includes a transgene. Other examples of 
transgenic animals include non-human primates, sheep, dogs, cows, goats, chickens, 
amphibians, etc. A transgene is exogenous DNA that is integrated into the genome of a cell 
from which a transgenic animal develops and that remains in the genome of the mature 
animal, thereby directing the expression of an encoded gene product in one or more cell types 
or tissues of the transgenic animal. As used herein, a "homologous recombinant animal" is a 
non-human animal, preferably a mammal, more preferably a mouse, in which an endogenous 
NOVX gene has been altered by homologous recombination between the endogenous gene
and an exogenous DNA molecule introduced into a cell of the animal, e.g., an embryonic cell
of the animal, prior to development of the animal.

A transgenic animal of the invention can be created by introducing NOVX-encoding 
nucleic acid into the male pronuclei of a fertilized oocyte (e.g., by microinjection, retroviral 
infection) and allowing the oocyte to develop in a pseudopregnant female foster animal. The
human NOVX cDNA sequences SEQ ID NOS:1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27,
29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77,
79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 99, 101, 103, 105, 107, 109 and 111 can be introduced
as a transgene into the genome of a non-human animal. Alternatively, a non-human
homologue of the human NOVX gene, such as a mouse NOVX gene, can be isolated based 
on hybridization to the human NOVX cDNA (described further supra) and used as a 
transgene. Intronic sequences and polyadenylation signals can also be included in the 
transgene to increase the efficiency of expression of the transgene. A tissue-specific 
regulatory sequence(s) can be operably-linked to the NOVX transgene to direct expression of 
NOVX protein to particular cells. Methods for generating transgenic animals via embryo 
manipulation and microinjection, particularly animals such as mice, have become 
conventional in the art and are described, for example, in U.S. Patent Nos. 4,736,866; 
4,870,009; and 4,873,191; and Hogan, 1986. In: Manipulating the Mouse Embryo, Cold
Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. Similar methods are used for 
production of other transgenic animals. A transgenic founder animal can be identified based 
upon the presence of the NOVX transgene in its genome and/or expression of NOVX mRNA 
in tissues or cells of the animals. A transgenic founder animal can then be used to breed 
additional animals carrying the transgene. Moreover, transgenic animals carrying a 
transgene-encoding NOVX protein can further be bred to other transgenic animals carrying 
other transgenes. 

To create a homologous recombinant animal, a vector is prepared which contains at 
least a portion of an NOVX gene into which a deletion, addition or substitution has been 
introduced to thereby alter, e.g., functionally disrupt, the NOVX gene. The NOVX gene can 
be a human gene (e.g., the cDNA of SEQ ID NOS:1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25,
27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 
77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 99, 101, 103, 105, 107, 109 and 111), but more 
preferably, is a non-human homologue of a human NOVX gene. For example, a mouse 
homologue of human NOVX gene of SEQ ID NOS:1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23,
25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73,
75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 99, 101, 103, 105, 107, 109 and 111 can be used
to construct a homologous recombination vector suitable for altering an endogenous NOVX 
gene in the mouse genome. In one embodiment, the vector is designed such that, upon 
homologous recombination, the endogenous NOVX gene is functionally disrupted (/.e., no 
longer encodes a functional protein; also referred to as a "knock out" vector). 

Alternatively, the vector can be designed such that, upon homologous recombination, 
the endogenous NOVX gene is mutated or otherwise altered but still encodes functional 
protein (e.g., the upstream regulatory region can be altered to thereby alter the expression of 
the endogenous NOVX protein). In the homologous recombination vector, the altered 
portion of the NOVX gene is flanked at its 5'- and 3'-termini by additional nucleic acid of the 
NOVX gene to allow for homologous recombination to occur between the exogenous NOVX 
gene carried by the vector and an endogenous NOVX gene in an embryonic stem cell. The 
additional flanking NOVX nucleic acid is of sufficient length for successful homologous 
recombination with the endogenous gene. Typically, several kilobases of flanking DNA 
(both at the 5'- and 3'-termini) are included in the vector. See, e.g., Thomas, et al., 1987. Cell 
51: 503 for a description of homologous recombination vectors. The vector is then introduced 
into an embryonic stem cell line (e.g., by electroporation) and cells in which the introduced 
NOVX gene has homologously-recombined with the endogenous NOVX gene are selected. 
See, e.g., Li, et al., 1992. Cell 69: 915. 

The selected cells are then injected into a blastocyst of an animal (e.g., a mouse) to 
form aggregation chimeras. See, e.g., Bradley, 1987. In: Teratocarcinomas and 
Embryonic Stem Cells: A Practical Approach, Robertson, ed. IRL, Oxford, pp. 
113-152. A chimeric embryo can then be implanted into a suitable pseudopregnant female 
foster animal and the embryo brought to term. Progeny harboring the homologously- 
recombined DNA in their germ cells can be used to breed animals in which all cells of the 
animal contain the homologously-recombined DNA by germline transmission of the 
transgene. Methods for constructing homologous recombination vectors and homologous 
recombinant animals are described further in Bradley, 1991. Curr. Opin. Biotechnol. 2: 
823-829; PCT International Publication Nos.: WO 90/11354; WO 91/01140; WO 92/0968; 
and WO 93/04169. 

In another embodiment, transgenic non-human animals can be produced that contain 
selected systems that allow for regulated expression of the transgene. One example of such a 
system is the cre/loxP recombinase system of bacteriophage P1. For a description of the 
cre/loxP recombinase system, see, e.g., Lakso, et al., 1992. Proc. Natl. Acad. Sci. USA 89: 
6232-6236. Another example of a recombinase system is the FLP recombinase system of 
Saccharomyces cerevisiae. See, O'Gorman, et al., 1991. Science 251:1351-1355. If a 
cre/loxP recombinase system is used to regulate expression of the transgene, animals 
containing transgenes encoding both the Cre recombinase and a selected protein are required. 
Such animals can be provided through the construction of "double" transgenic animals, e.g., 
by mating two transgenic animals, one containing a transgene encoding a selected protein and 
the other containing a transgene encoding a recombinase. 

Clones of the non-human transgenic animals described herein can also be produced 
according to the methods described in Wilmut, et al., 1997. Nature 385: 810-813. In brief, a 
cell (e.g., a somatic cell) from the transgenic animal can be isolated and induced to exit the 
growth cycle and enter G0 phase. The quiescent cell can then be fused, e.g., through the use 
of electrical pulses, to an enucleated oocyte from an animal of the same species from which 
the quiescent cell is isolated. The reconstructed oocyte is then cultured such that it develops 
to morula or blastocyst and then transferred to a pseudopregnant female foster animal. The 
offspring borne of this female foster animal will be a clone of the animal from which the cell 
(e.g., the somatic cell) is isolated. 

Pharmaceutical Compositions 

The NOVX nucleic acid molecules, NOVX proteins, and anti-NOVX antibodies (also 
referred to herein as "active compounds") of the invention, and derivatives, fragments, 
analogs and homologs thereof, can be incorporated into pharmaceutical compositions suitable 
for administration. Such compositions typically comprise the nucleic acid molecule, protein, 
or antibody and a pharmaceutically acceptable carrier. As used herein, "pharmaceutically 
acceptable carrier" is intended to include any and all solvents, dispersion media, coatings, 
antibacterial and antifungal agents, isotonic and absorption delaying agents, and the like, 
compatible with pharmaceutical administration. Suitable carriers are described in the most 
recent edition of Remington's Pharmaceutical Sciences, a standard reference text in the field, 
which is incorporated herein by reference. Preferred examples of such carriers or diluents 
include, but are not limited to, water, saline, Ringer's solutions, dextrose solution, and 5% 
human serum albumin. Liposomes and non-aqueous vehicles such as fixed oils may also be 
used. The use of such media and agents for pharmaceutically active substances is well 
known in the art. Except insofar as any conventional media or agent is incompatible with the 
active compound, use thereof in the compositions is contemplated. Supplementary active 
compounds can also be incorporated into the compositions. 

A pharmaceutical composition of the invention is formulated to be compatible with its 
intended route of administration. Examples of routes of administration include parenteral, 
e.g., intravenous, intradermal, subcutaneous, oral (e.g., inhalation), transdermal (i.e., topical), 
transmucosal, and rectal administration. Solutions or suspensions used for parenteral, 
intradermal, or subcutaneous application can include the following components: a sterile 
diluent such as water for injection, saline solution, fixed oils, polyethylene glycols, glycerine, 
propylene glycol or other synthetic solvents; antibacterial agents such as benzyl alcohol or 
methyl parabens; antioxidants such as ascorbic acid or sodium bisulfite; chelating agents such 
as ethylenediaminetetraacetic acid (EDTA); buffers such as acetates, citrates or phosphates, 
and agents for the adjustment of tonicity such as sodium chloride or dextrose. The pH can be 
adjusted with acids or bases, such as hydrochloric acid or sodium hydroxide. The parenteral 
preparation can be enclosed in ampoules, disposable syringes or multiple dose vials made of 
glass or plastic. 

Pharmaceutical compositions suitable for injectable use include sterile aqueous 
solutions (where water soluble) or dispersions and sterile powders for the extemporaneous 
preparation of sterile injectable solutions or dispersion. For intravenous administration, 
suitable carriers include physiological saline, bacteriostatic water, Cremophor EL™ (BASF, 
Parsippany, N.J.) or phosphate buffered saline (PBS). In all cases, the composition must be 
sterile and should be fluid to the extent that easy syringeability exists. It must be stable under 
the conditions of manufacture and storage and must be preserved against the contaminating 
action of microorganisms such as bacteria and fungi. The carrier can be a solvent or 
dispersion medium containing, for example, water, ethanol, polyol (for example, glycerol, 
propylene glycol, and liquid polyethylene glycol, and the like), and suitable mixtures thereof. 
The proper fluidity can be maintained, for example, by the use of a coating such as lecithin, 
by the maintenance of the required particle size in the case of dispersion and by the use of 
surfactants. Prevention of the action of microorganisms can be achieved by various 
antibacterial and antifungal agents, for example, parabens, chlorobutanol, phenol, ascorbic 
acid, thimerosal, and the like. In many cases, it will be preferable to include isotonic agents, 
for example, sugars, polyalcohols such as mannitol and sorbitol, and sodium chloride in the 
composition. Prolonged absorption of the injectable compositions can be brought about by 
including in the composition an agent which delays absorption, for example, aluminum 
monostearate and gelatin. 

Sterile injectable solutions can be prepared by incorporating the active compound 
(e.g., an NOVX protein or anti-NOVX antibody) in the required amount in an appropriate 
solvent with one or a combination of ingredients enumerated above, as required, followed by 
filtered sterilization. Generally, dispersions are prepared by incorporating the active 
compound into a sterile vehicle that contains a basic dispersion medium and the required 
other ingredients from those enumerated above. In the case of sterile powders for the 
preparation of sterile injectable solutions, methods of preparation are vacuum drying and 
freeze-drying that yields a powder of the active ingredient plus any additional desired 
ingredient from a previously sterile-filtered solution thereof. 

Oral compositions generally include an inert diluent or an edible carrier. They can be 
enclosed in gelatin capsules or compressed into tablets. For the purpose of oral therapeutic 
administration, the active compound can be incorporated with excipients and used in the form 
of tablets, troches, or capsules. Oral compositions can also be prepared using a fluid carrier 
for use as a mouthwash, wherein the compound in the fluid carrier is applied orally and 
swished and expectorated or swallowed. Pharmaceutically compatible binding agents, and/or 
adjuvant materials can be included as part of the composition. The tablets, pills, capsules, 
troches and the like can contain any of the following ingredients, or compounds of a similar 
nature: a binder such as microcrystalline cellulose, gum tragacanth or gelatin; an excipient 
such as starch or lactose; a disintegrating agent such as alginic acid, Primogel, or corn starch; 
a lubricant such as magnesium stearate or Sterotes; a glidant such as colloidal silicon dioxide; 
a sweetening agent such as sucrose or saccharin; or a flavoring agent such as peppermint, 
methyl salicylate, or orange flavoring. 

For administration by inhalation, the compounds are delivered in the form of an 
aerosol spray from a pressurized container or dispenser which contains a suitable propellant, e.g., 
a gas such as carbon dioxide, or a nebulizer. 

Systemic administration can also be by transmucosal or transdermal means. For 
transmucosal or transdermal administration, penetrants appropriate to the barrier to be 
permeated are used in the formulation. Such penetrants are generally known in the art, and 
include, for example, for transmucosal administration, detergents, bile salts, and fusidic acid 
derivatives. Transmucosal administration can be accomplished through the use of nasal 
sprays or suppositories. For transdermal administration, the active compounds are 
formulated into ointments, salves, gels, or creams as generally known in the art. 

The compounds can also be prepared in the form of suppositories (e.g., with 
conventional suppository bases such as cocoa butter and other glycerides) or retention 
enemas for rectal delivery. 

In one embodiment, the active compounds are prepared with carriers that will protect 
the compound against rapid elimination from the body, such as a controlled release 
formulation, including implants and microencapsulated delivery systems. Biodegradable, 
biocompatible polymers can be used, such as ethylene vinyl acetate, polyanhydrides, 
polyglycolic acid, collagen, polyorthoesters, and polylactic acid. Methods for preparation of 
such formulations will be apparent to those skilled in the art. The materials can also be 
obtained commercially from Alza Corporation and Nova Pharmaceuticals, Inc. Liposomal 
suspensions (including liposomes targeted to infected cells with monoclonal antibodies to 
viral antigens) can also be used as pharmaceutically acceptable carriers. These can be 
prepared according to methods known to those skilled in the art, for example, as described in 
U.S. Patent No. 4,522,811. 

It is especially advantageous to formulate oral or parenteral compositions in dosage 
unit form for ease of administration and uniformity of dosage. Dosage unit form as used 
herein refers to physically discrete units suited as unitary dosages for the subject to be 
treated; each unit containing a predetermined quantity of active compound calculated to 
produce the desired therapeutic effect in association with the required pharmaceutical carrier. 
The specifications for the dosage unit forms of the invention are dictated by and directly 
dependent on the unique characteristics of the active compound and the particular therapeutic 
effect to be achieved, and the limitations inherent in the art of compounding such an active 
compound for the treatment of individuals. 

The nucleic acid molecules of the invention can be inserted into vectors and used as 
gene therapy vectors. Gene therapy vectors can be delivered to a subject by, for example, 
intravenous injection, local administration (see, e.g., U.S. Patent No. 5,328,470) or by 
stereotactic injection (see, e.g., Chen, et al., 1994. Proc. Natl. Acad. Sci. USA 91: 
3054-3057). The pharmaceutical preparation of the gene therapy vector can include the gene 
therapy vector in an acceptable diluent, or can comprise a slow release matrix in which the 
gene delivery vehicle is imbedded. Alternatively, where the complete gene delivery vector 
can be produced intact from recombinant cells, e.g., retroviral vectors, the pharmaceutical 
preparation can include one or more cells that produce the gene delivery system. 

The pharmaceutical compositions can be included in a container, pack, or dispenser 
together with instructions for administration. 

Screening and Detection Methods 

The isolated nucleic acid molecules of the invention can be used to express NOVX 
protein (e.g., via a recombinant expression vector in a host cell in gene therapy applications), 
to detect NOVX mRNA (e.g., in a biological sample) or a genetic lesion in an NOVX gene, 
and to modulate NOVX activity, as described further, below. In addition, the NOVX proteins 
can be used to screen drugs or compounds that modulate the NOVX protein activity or 
expression as well as to treat disorders characterized by insufficient or excessive production 
of NOVX protein or production of NOVX protein forms that have decreased or aberrant 
activity compared to NOVX wild-type protein (e.g., diabetes (regulates insulin release); 
obesity (binds and transports lipids); metabolic disturbances associated with obesity, the 
metabolic syndrome X as well as anorexia and wasting disorders associated with chronic 
diseases and various cancers; infectious disease (possesses anti-microbial activity); and the 
various dyslipidemias). In addition, the anti-NOVX antibodies of the invention can be used to 
detect and isolate NOVX proteins and modulate NOVX activity. In yet a further aspect, the 
invention can be used in methods to influence appetite, absorption of nutrients and the 
disposition of metabolic substrates in both a positive and negative fashion. 

The invention further pertains to novel agents identified by the screening assays 
described herein and uses thereof for treatments as described, supra. 

Screening Assays 

The invention provides a method (also referred to herein as a "screening assay") for 
identifying modulators, i.e., candidate or test compounds or agents (e.g., peptides, 
peptidomimetics, small molecules or other drugs) that bind to NOVX proteins or have a 
stimulatory or inhibitory effect on, e.g., NOVX protein expression or NOVX protein activity. 
The invention also includes compounds identified in the screening assays described herein. 

In one embodiment, the invention provides assays for screening candidate or test 
compounds which bind to or modulate the activity of the membrane-bound form of an NOVX 
protein or polypeptide or biologically-active portion thereof. The test compounds of the 
invention can be obtained using any of the numerous approaches in combinatorial library 
methods known in the art, including: biological libraries; spatially addressable parallel solid 
phase or solution phase libraries; synthetic library methods requiring deconvolution; the 
"one-bead one-compound" library method; and synthetic library methods using affinity 
chromatography selection. The biological library approach is limited to peptide libraries, 
while the other four approaches are applicable to peptide, non-peptide oligomer or small 
molecule libraries of compounds. See, e.g., Lam, 1997. Anticancer Drug Design 12: 145. 

A "small molecule" as used herein, is meant to refer to a composition that has a 
molecular weight of less than about 5 kD and most preferably less than about 4 kD. Small 
molecules can be, e.g., nucleic acids, peptides, polypeptides, peptidomimetics, carbohydrates, 
lipids or other organic or inorganic molecules. Libraries of chemical and/or biological 
mixtures, such as fungal, bacterial, or algal extracts, are known in the art and can be screened 
with any of the assays of the invention. 

Examples of methods for the synthesis of molecular libraries can be found in the art, 
for example in: DeWitt, et al., 1993. Proc. Natl. Acad. Sci. U.S.A. 90: 6909; Erb, et al., 1994. 
Proc. Natl. Acad. Sci. U.S.A. 91: 11422; Zuckermann, et al., 1994. J. Med. Chem. 37: 2678; 
Cho, et al., 1993. Science 261: 1303; Carrell, et al., 1994. Angew. Chem. Int. Ed. Engl. 33: 
2059; Carell, et al., 1994. Angew. Chem. Int. Ed. Engl. 33: 2061; and Gallop, et al., 1994. J. 
Med. Chem. 37: 1233. 

Libraries of compounds may be presented in solution (e.g., Houghten, 1992. 
Biotechniques 13: 412-421), or on beads (Lam, 1991. Nature 354: 82-84), on chips (Fodor, 
1993. Nature 364: 555-556), bacteria (Ladner, U.S. Patent No. 5,223,409), spores (Ladner, 
U.S. Patent No. 5,233,409), plasmids (Cull, et al., 1992. Proc. Natl. Acad. Sci. USA 89: 
1865-1869) or on phage (Scott and Smith, 1990. Science 249: 386-390; Devlin, 1990. Science 
249: 404-406; Cwirla, et al., 1990. Proc. Natl. Acad. Sci. U.S.A. 87: 6378-6382; Felici, 1991. 
J. Mol. Biol. 222: 301-310; Ladner, U.S. Patent No. 5,233,409). 

In one embodiment, an assay is a cell-based assay in which a cell which expresses a 
membrane-bound form of NOVX protein, or a biologically-active portion thereof, on the cell 
surface is contacted with a test compound and the ability of the test compound to bind to an 
NOVX protein determined. The cell, for example, can be of mammalian origin or a yeast cell. 
Determining the ability of the test compound to bind to the NOVX protein can be 
accomplished, for example, by coupling the test compound with a radioisotope or enzymatic 
label such that binding of the test compound to the NOVX protein or biologically-active 
portion thereof can be determined by detecting the labeled compound in a complex. For 
example, test compounds can be labeled with ¹²⁵I, ³⁵S, ¹⁴C, or ³H, either directly or indirectly, 
and the radioisotope detected by direct counting of radioemission or by scintillation counting. 
Alternatively, test compounds can be enzymatically-labeled with, for example, horseradish 
peroxidase, alkaline phosphatase, or luciferase, and the enzymatic label detected by 
determination of conversion of an appropriate substrate to product. In one embodiment, the 
assay comprises contacting a cell which expresses a membrane-bound form of NOVX 
protein, or a biologically-active portion thereof, on the cell surface with a known compound 
which binds NOVX to form an assay mixture, contacting the assay mixture with a test 
compound, and determining the ability of the test compound to interact with an NOVX 
protein, wherein determining the ability of the test compound to interact with an NOVX 
protein comprises determining the ability of the test compound to preferentially bind to 
NOVX protein or a biologically-active portion thereof as compared to the known compound. 

In another embodiment, an assay is a cell-based assay comprising contacting a cell 
expressing a membrane-bound form of NOVX protein, or a biologically-active portion 
thereof, on the cell surface with a test compound and determining the ability of the test 
compound to modulate (e.g., stimulate or inhibit) the activity of the NOVX protein or 
biologically-active portion thereof. Determining the ability of the test compound to modulate 
the activity of NOVX or a biologically-active portion thereof can be accomplished, for 
example, by determining the ability of the NOVX protein to bind to or interact with an 
NOVX target molecule. As used herein, a "target molecule" is a molecule with which an 
NOVX protein binds or interacts in nature, for example, a molecule on the surface of a cell 
which expresses an NOVX interacting protein, a molecule on the surface of a second cell, a 
molecule in the extracellular milieu, a molecule associated with the internal surface of a cell 
membrane or a cytoplasmic molecule. An NOVX target molecule can be a non-NOVX 
molecule or an NOVX protein or polypeptide of the invention. In one embodiment, an 
NOVX target molecule is a component of a signal transduction pathway that facilitates 
transduction of an extracellular signal (e.g., a signal generated by binding of a compound to a 
membrane-bound NOVX molecule) through the cell membrane and into the cell. The target, 
for example, can be a second intercellular protein that has catalytic activity or a protein that 
facilitates the association of downstream signaling molecules with NOVX. 

Determining the ability of the NOVX protein to bind to or interact with an NOVX 
target molecule can be accomplished by one of the methods described above for determining 
direct binding. In one embodiment, determining the ability of the NOVX protein to bind to or 
interact with an NOVX target molecule can be accomplished by determining the activity of 
the target molecule. For example, the activity of the target molecule can be determined by 
detecting induction of a cellular second messenger of the target (i.e., intracellular Ca²⁺, 
diacylglycerol, IP3, etc.), detecting catalytic/enzymatic activity of the target on an appropriate 
substrate, detecting the induction of a reporter gene (comprising an NOVX-responsive 
regulatory element operatively linked to a nucleic acid encoding a detectable marker, e.g., 
luciferase), or detecting a cellular response, for example, cell survival, cellular differentiation, 
or cell proliferation. 

In yet another embodiment, an assay of the invention is a cell-free assay comprising 
contacting an NOVX protein or biologically-active portion thereof with a test compound and 
determining the ability of the test compound to bind to the NOVX protein or biologically- 
active portion thereof. Binding of the test compound to the NOVX protein can be determined 
either directly or indirectly as described above. In one such embodiment, the assay comprises 
contacting the NOVX protein or biologically-active portion thereof with a known compound 
which binds NOVX to form an assay mixture, contacting the assay mixture with a test 
compound, and determining the ability of the test compound to interact with an NOVX 
protein, wherein determining the ability of the test compound to interact with an NOVX 
protein comprises determining the ability of the test compound to preferentially bind to 
NOVX or biologically-active portion thereof as compared to the known compound. 
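
By way of illustration only, the competitive format described above can be reduced to a simple calculation. The following Python sketch assumes that the known compound is labeled, that a detector signal (e.g., counts) is recorded with and without the test compound, and that a nonspecific background signal is available. The function names, the example readings, and the 50% cutoff are hypothetical and are not taken from the specification.

```python
def percent_displacement(total_signal, nonspecific_signal, signal_with_test):
    """Percent of specifically bound labeled known compound displaced by a test compound.

    total_signal:       signal from labeled known compound alone (hypothetical units)
    nonspecific_signal: background signal not attributable to NOVX binding
    signal_with_test:   signal remaining after the test compound is added
    """
    specific = total_signal - nonspecific_signal
    if specific <= 0:
        raise ValueError("total signal must exceed nonspecific signal")
    remaining = signal_with_test - nonspecific_signal
    return 100.0 * (1.0 - remaining / specific)


def preferentially_binds(total_signal, nonspecific_signal, signal_with_test, cutoff=50.0):
    """Flag a test compound that displaces at least `cutoff` percent of the known compound.

    The 50 percent default is an arbitrary illustrative threshold (an assumption).
    """
    return percent_displacement(total_signal, nonspecific_signal, signal_with_test) >= cutoff


# Example with made-up readings: 1000 total, 100 background, 325 after test compound.
print(percent_displacement(1000, 100, 325))   # 75.0
print(preferentially_binds(1000, 100, 325))   # True
```

In practice a full dose-response series would ordinarily be measured rather than a single point; the single-point form is used here only to keep the sketch short.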

In still another embodiment, an assay is a cell-free assay comprising contacting 
NOVX protein or biologically-active portion thereof with a test compound and determining 
the ability of the test compound to modulate (e.g. stimulate or inhibit) the activity of the 
NOVX protein or biologically-active portion thereof. Determining the ability of the test 
compound to modulate the activity of NOVX can be accomplished, for example, by 
determining the ability of the NOVX protein to bind to an NOVX target molecule by one of 
the methods described above for determining direct binding. In an alternative embodiment, 
determining the ability of the test compound to modulate the activity of NOVX protein can 
be accomplished by determining the ability of the NOVX protein to further modulate an NOVX 
target molecule. For example, the catalytic/enzymatic activity of the target molecule on an 
appropriate substrate can be determined as described, supra. 

In yet another embodiment, the cell-free assay comprises contacting the NOVX 
protein or biologically-active portion thereof with a known compound which binds NOVX 
protein to form an assay mixture, contacting the assay mixture with a test compound, and 
determining the ability of the test compound to interact with an NOVX protein, wherein 
determining the ability of the test compound to interact with an NOVX protein comprises 
determining the ability of the NOVX protein to preferentially bind to or modulate the activity 
of an NOVX target molecule. 

The cell-free assays of the invention are amenable to use of either the soluble form or 
the membrane-bound form of NOVX protein. In the case of cell-free assays comprising the 
membrane-bound form of NOVX protein, it may be desirable to utilize a solubilizing agent 
such that the membrane-bound form of NOVX protein is maintained in solution. Examples 
of such solubilizing agents include non-ionic detergents such as n-octylglucoside, 
n-dodecylglucoside, n-dodecylmaltoside, octanoyl-N-methylglucamide, 
decanoyl-N-methylglucamide, Triton® X-100, Triton® X-114, Thesit®, 
Isotridecypoly(ethylene glycol ether)n, N-dodecyl-N,N-dimethyl-3-ammonio-1-propane 
sulfonate, 3-(3-cholamidopropyl)dimethylamminiol-1-propane sulfonate (CHAPS), or 
3-(3-cholamidopropyl)dimethylamminiol-2-hydroxy-1-propane sulfonate (CHAPSO). 

In more than one embodiment of the above assay methods of the invention, it may be 
desirable to immobilize either NOVX protein or its target molecule to facilitate separation of 
complexed from uncomplexed forms of one or both of the proteins, as well as to 
accommodate automation of the assay. Binding of a test compound to NOVX protein, or 
interaction of NOVX protein with a target molecule in the presence and absence of a 
candidate compound, can be accomplished in any vessel suitable for containing the reactants. 
Examples of such vessels include microtiter plates, test tubes, and micro-centrifuge tubes. In 
one embodiment, a fusion protein can be provided that adds a domain that allows one or both 
of the proteins to be bound to a matrix. For example, GST-NOVX fusion proteins or GST- 
target fusion proteins can be adsorbed onto glutathione sepharose beads (Sigma Chemical, St. 
Louis, MO) or glutathione derivatized microtiter plates, that are then combined with the test 
compound or the test compound and either the non-adsorbed target protein or NOVX protein, 
and the mixture is incubated under conditions conducive to complex formation (e.g., at 
physiological conditions for salt and pH). Following incubation, the beads or microtiter plate 
wells are washed to remove any unbound components, the matrix is immobilized in the case of 
beads, and complex formation is determined either directly or indirectly, for example, as described, supra. 
Alternatively, the complexes can be dissociated from the matrix, and the level of NOVX 
protein binding or activity determined using standard techniques. 

Other techniques for immobilizing proteins on matrices can also be used in the 
screening assays of the invention. For example, either the NOVX protein or its target 
molecule can be immobilized utilizing conjugation of biotin and streptavidin. Biotinylated 
NOVX protein or target molecules can be prepared from biotin-NHS 
(N-hydroxy-succinimide) using techniques well-known within the art (e.g., biotinylation kit, 
Pierce Chemicals, Rockford, Ill.), and immobilized in the wells of streptavidin-coated 96 well 
plates (Pierce Chemical). Alternatively, antibodies reactive with NOVX protein or target 
molecules, but which do not interfere with binding of the NOVX protein to its target 
molecule, can be derivatized to the wells of the plate, and unbound target or NOVX protein 
trapped in the wells by antibody conjugation. Methods for detecting such complexes, in 
addition to those described above for the GST-immobilized complexes, include 
immunodetection of complexes using antibodies reactive with the NOVX protein or target 
molecule, as well as enzyme-linked assays that rely on detecting an enzymatic activity 
associated with the NOVX protein or target molecule. 

In another embodiment, modulators of NOVX protein expression are identified in a 
method wherein a cell is contacted with a candidate compound and the expression of NOVX 
mRNA or protein in the cell is determined. The level of expression of NOVX mRNA or 
protein in the presence of the candidate compound is compared to the level of expression of 
NOVX mRNA or protein in the absence of the candidate compound. The candidate 
compound can then be identified as a modulator of NOVX mRNA or protein expression 
based upon this comparison. For example, when expression of NOVX mRNA or protein is 
greater (i.e., statistically significantly greater) in the presence of the candidate compound than 
in its absence, the candidate compound is identified as a stimulator of NOVX mRNA or 
protein expression. Alternatively, when expression of NOVX mRNA or protein is less 
(statistically significantly less) in the presence of the candidate compound than in its absence, 
the candidate compound is identified as an inhibitor of NOVX mRNA or protein expression. 
The level of NOVX mRNA or protein expression in the cells can be determined by methods 
described herein for detecting NOVX mRNA or protein. 
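
The comparison described above can be sketched in code. The following Python example is a minimal illustration, assuming replicate expression measurements (in arbitrary units) with and without the candidate compound and using an exact two-sided permutation test on the difference in means. The specification does not prescribe any particular statistical test; the test, the function names, the 0.05 significance level, and the sample values are all assumptions made for illustration.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(treated, untreated):
    """Exact two-sided permutation p-value for a difference in group means."""
    pooled = list(treated) + list(untreated)
    n_treated = len(treated)
    n_untreated = len(pooled) - n_treated
    observed = abs(mean(treated) - mean(untreated))
    total = sum(pooled)
    hits = 0
    count = 0
    # Enumerate every way of relabeling the pooled values as "treated".
    for idx in combinations(range(len(pooled)), n_treated):
        group_sum = sum(pooled[i] for i in idx)
        diff = abs(group_sum / n_treated - (total - group_sum) / n_untreated)
        if diff >= observed - 1e-12:  # small tolerance for floating-point ties
            hits += 1
        count += 1
    return hits / count


def classify_candidate(treated, untreated, alpha=0.05):
    """Label a candidate compound as stimulator, inhibitor, or neither."""
    if permutation_p_value(treated, untreated) >= alpha:
        return "no significant effect"
    return "stimulator" if mean(treated) > mean(untreated) else "inhibitor"


# Made-up NOVX mRNA levels, four replicates per condition.
with_compound = [2.1, 2.3, 2.2, 2.4]
without_compound = [1.0, 1.1, 0.9, 1.2]
print(classify_candidate(with_compound, without_compound))  # stimulator
```

With this exact test, at least four replicates per condition are needed before a result can fall below the 0.05 level, since three per condition yield only 20 possible relabelings.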

In yet another aspect of the invention, the NOVX proteins can be used as "bait 
proteins" in a two-hybrid assay or three hybrid assay (see, e.g., U.S. Patent No. 5,283,317; 
Zervos, et al., 1993. Cell 72: 223-232; Madura, et al., 1993. J. Biol. Chem. 268: 
12046-12054; Bartel, et al., 1993. Biotechniques 14: 920-924; Iwabuchi, et al., 1993. 
Oncogene 8: 1693-1696; and Brent WO 94/10300), to identify other proteins that bind to or 
interact with NOVX ("NOVX-binding proteins" or "NOVX-bp") and modulate NOVX 
activity. Such NOVX-binding proteins are also likely to be involved in the propagation of 
signals by the NOVX proteins as, for example, upstream or downstream elements of the 
NOVX pathway. 

The two-hybrid system is based on the modular nature of most transcription factors, 
which consist of separable DNA-binding and activation domains. Briefly, the assay utilizes 
two different DNA constructs. In one construct, the gene that codes for NOVX is fused to a 
gene encoding the DNA binding domain of a known transcription factor (e.g., GAL-4). In 
the other construct, a DNA sequence, from a library of DNA sequences, that encodes an 
unidentified protein ("prey" or "sample") is fused to a gene that codes for the activation 
domain of the known transcription factor. If the "bait" and the "prey" proteins are able to 
interact, in vivo, forming an NOVX-dependent complex, the DNA-binding and activation 
domains of the transcription factor are brought into close proximity. This proximity allows 
transcription of a reporter gene (e.g., LacZ) that is operably linked to a transcriptional 
regulatory site responsive to the transcription factor. Expression of the reporter gene can be 
detected and cell colonies containing the functional transcription factor can be isolated and 
used to obtain the cloned gene that encodes the protein which interacts with NOVX. 

The invention further pertains to novel agents identified by the aforementioned 
screening assays and uses thereof for treatments as described herein. 

Detection Assays 

Portions or fragments of the cDNA sequences identified herein (and the 
corresponding complete gene sequences) can be used in numerous ways as polynucleotide 
reagents. By way of example, and not of limitation, these sequences can be used to: (i) map
their respective genes on a chromosome and, thus, locate gene regions associated with
genetic disease; (ii) identify an individual from a minute biological sample (tissue typing);
and (iii) aid in forensic identification of a biological sample. Some of these applications are
described in the subsections, below. 

Chromosome Mapping 

Once the sequence (or a portion of the sequence) of a gene has been isolated, this 
sequence can be used to map the location of the gene on a chromosome. This process is 
called chromosome mapping. Accordingly, portions or fragments of the NOVX sequences, 
SEQ ID NOS:1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45,
47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95,
97, 99, 101, 103, 105, 107, 109 and 111, or fragments or derivatives thereof, can be used to
map the location of the NOVX genes, respectively, on a chromosome. The mapping of the 
NOVX sequences to chromosomes is an important first step in correlating these sequences 
with genes associated with disease. 

Briefly, NOVX genes can be mapped to chromosomes by preparing PCR primers
(preferably 15-25 bp in length) from the NOVX sequences. Computer analysis of the
NOVX sequences can be used to rapidly select primers that do not span more than one exon
in the genomic DNA, since primers that span exon boundaries complicate the amplification
process. These primers can then be
used for PCR screening of somatic cell hybrids containing individual human chromosomes.
Only those hybrids containing the human gene corresponding to the NOVX sequences will
yield an amplified fragment.
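The computer selection of primers that stay within one exon can be sketched as follows. This is a minimal illustration under stated assumptions: exon boundaries are taken as already known and expressed as half-open intervals in cDNA coordinates, and the function name and inputs are hypothetical. It only enumerates candidates of 15-25 bases that lie inside a single exon; it does not score melting temperature or specificity.

```python
def primers_within_single_exon(cdna, exon_bounds, min_len=15, max_len=25):
    """List candidate primers that lie entirely inside one exon.

    cdna:        cDNA sequence as a string.
    exon_bounds: list of (start, end) half-open intervals in cDNA
                 coordinates, one per exon (assumed to be known).
    Returns a list of (start, end, sequence) tuples.
    """
    candidates = []
    for exon_start, exon_end in exon_bounds:
        for length in range(min_len, max_len + 1):
            # The last valid start keeps the whole primer inside the exon.
            for start in range(exon_start, exon_end - length + 1):
                end = start + length
                candidates.append((start, end, cdna[start:end]))
    return candidates
```

Because every candidate is generated from within one interval, no returned primer crosses an exon junction.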

Somatic cell hybrids are prepared by fusing somatic cells from different mammals 
(e.g., human and mouse cells). As hybrids of human and mouse cells grow and divide, they
gradually lose human chromosomes in random order, but retain the mouse chromosomes. By
using media in which mouse cells cannot grow, because they lack a particular enzyme, but in 
which human cells can, the one human chromosome that contains the gene encoding the 
needed enzyme will be retained. By using various media, panels of hybrid cell lines can be 
established. Each cell line in a panel contains either a single human chromosome or a small 
number of human chromosomes, and a full set of mouse chromosomes, allowing easy 
mapping of individual genes to specific human chromosomes. See, e.g., D'Eustachio, et al., 
1983. Science 220: 919-924. Somatic cell hybrids containing only fragments of human 
chromosomes can also be produced by using human chromosomes with translocations and 
deletions.

PCR mapping of somatic cell hybrids is a rapid procedure for assigning a particular 
sequence to a particular chromosome. Three or more sequences can be assigned per day 
using a single thermal cycler. Using the NOVX sequences to design oligonucleotide primers, 
sub-localization can be achieved with panels of fragments from specific chromosomes. 
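Assigning a sequence to a chromosome from a hybrid panel amounts to a concordance check, which can be sketched as below. This is an illustrative sketch only; the panel layout, hybrid names, and function name are hypothetical. A chromosome is reported when its presence in each hybrid line agrees with that line's PCR result.

```python
def assign_chromosome(panel, pcr_positive):
    """Find human chromosomes concordant with PCR results across a panel.

    panel:        dict mapping hybrid cell line name -> set of human
                  chromosomes retained by that line.
    pcr_positive: set of hybrid names that yielded an amplified fragment.
    Returns a sorted list of chromosomes whose presence or absence matches
    the PCR result in every hybrid line.
    """
    chromosomes = set().union(*panel.values())
    concordant = []
    for chrom in sorted(chromosomes):
        if all((chrom in retained) == (name in pcr_positive)
               for name, retained in panel.items()):
            concordant.append(chrom)
    return concordant


# Hypothetical panel of four hybrid lines.
panel = {
    "H1": {"1", "7"},
    "H2": {"7", "X"},
    "H3": {"1", "X"},
    "H4": {"2"},
}
result = assign_chromosome(panel, pcr_positive={"H1", "H2"})
```

In this made-up panel only chromosome 7 is present in exactly the two PCR-positive lines, so it is the single concordant assignment.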

Fluorescence in situ hybridization (FISH) of a DNA sequence to a metaphase 
chromosomal spread can further be used to provide a precise chromosomal location in one 
step. Chromosome spreads can be made using cells whose division has been blocked in 
metaphase by a chemical like colcemid that disrupts the mitotic spindle. The chromosomes 
can be treated briefly with trypsin, and then stained with Giemsa. A pattern of light and dark 
bands develops on each chromosome, so that the chromosomes can be identified individually. 
The FISH technique can be used with a DNA sequence as short as 500 or 600 bases. 
However, clones larger than 1,000 bases have a higher likelihood of binding to a unique 
chromosomal location with sufficient signal intensity for simple detection. Preferably 1,000
bases, and more preferably 2,000 bases, will suffice to get good results in a reasonable
amount of time. For a review of this technique, see, Verma, et al., HUMAN CHROMOSOMES:
A MANUAL OF BASIC TECHNIQUES (Pergamon Press, New York 1988).

Reagents for chromosome mapping can be used individually to mark a single 
chromosome or a single site on that chromosome, or panels of reagents can be used for 
marking multiple sites and/or multiple chromosomes. Reagents corresponding to noncoding 
regions of the genes actually are preferred for mapping purposes. Coding sequences are more 
likely to be conserved within gene families, thus increasing the chance of cross hybridizations
during chromosomal mapping. 

Once a sequence has been mapped to a precise chromosomal location, the physical 
position of the sequence on the chromosome can be correlated with genetic map data. Such 
data are found, e.g., in McKusick, MENDELIAN INHERITANCE IN MAN (available on-line
through Johns Hopkins University Welch Medical Library). The relationship between genes
and disease, mapped to the same chromosomal region, can then be identified through linkage
analysis (co-inheritance of physically adjacent genes), described in, e.g., Egeland, et al.,
1987. Nature, 325: 783-787.

Moreover, differences in the DNA sequences between individuals affected and
unaffected with a disease associated with the NOVX gene, can be determined. If a mutation
is observed in some or all of the affected individuals but not in any unaffected individuals,
then the mutation is likely to be the causative agent of the particular disease. Comparison of
affected and unaffected individuals generally involves first looking for structural alterations
in the chromosomes, such as deletions or translocations that are visible from chromosome
spreads or detectable using PCR based on that DNA sequence. Ultimately, complete
sequencing of genes from several individuals can be performed to confirm the presence of a 
mutation and to distinguish mutations from polymorphisms. 

Tissue Typing

The NOVX sequences of the invention can also be used to identify individuals from
minute biological samples. In this technique, an individual's genomic DNA is digested with
one or more restriction enzymes, and probed on a Southern blot to yield unique bands for
identification. The sequences of the invention are useful as additional DNA markers for
RFLP ("restriction fragment length polymorphisms," described in U.S. Patent No.
5,272,057).

Furthermore, the sequences of the invention can be used to provide an alternative 
technique that determines the actual base-by-base DNA sequence of selected portions of an
individual's genome. Thus, the NOVX sequences described herein can be used to prepare
two PCR primers from the 5'- and 3'-termini of the sequences. These primers can then be
used to amplify an individual's DNA and subsequently sequence it.

Panels of corresponding DNA sequences from individuals, prepared in this manner, 
can provide unique individual identifications, as each individual will have a unique set of 
such DNA sequences due to allelic differences. The sequences of the invention can be used 
to obtain such identification sequences from individuals and from tissue. The NOVX
sequences of the invention uniquely represent portions of the human genome. Allelic 
variation occurs to some degree in the coding regions of these sequences, and to a greater 
degree in the noncoding regions. It is estimated that allelic variation between individual 
humans occurs with a frequency of about once per each 500 bases. Much of the allelic 
variation is due to single nucleotide polymorphisms (SNPs), which include restriction 
fragment length polymorphisms (RFLPs). 

Each of the sequences described herein can, to some degree, be used as a standard 
against which DNA from an individual can be compared for identification purposes. Because 
greater numbers of polymorphisms occur in the noncoding regions, fewer sequences are 
necessary to differentiate individuals. The noncoding sequences can comfortably provide 
positive individual identification with a panel of perhaps 10 to 1,000 primers that each yield a 
noncoding amplified sequence of 100 bases. If predicted coding sequences, such as those in 
SEQ ID NOS:1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45,
47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95,
97, 99, 101, 103, 105, 107, 109 and 111 are used, a more appropriate number of primers for
positive individual identification would be 500-2,000. 
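The panel sizes quoted above can be related to the stated variation frequency with simple arithmetic. The sketch below is an illustration only: it applies the genome-wide estimate of about one allelic difference per 500 bases uniformly to every amplified sequence, which is an assumption, since the text notes that variation is higher in noncoding regions and lower in coding regions. The function name is hypothetical.

```python
def expected_variant_sites(n_amplicons, amplicon_length=100, bases_per_variant=500):
    """Expected number of variable positions covered by a primer panel.

    Assumes one allelic difference per `bases_per_variant` bases, applied
    uniformly to every amplified sequence (a simplifying assumption).
    """
    return n_amplicons * amplicon_length / bases_per_variant


small_panel = expected_variant_sites(10)     # 10 amplicons of 100 bases
large_panel = expected_variant_sites(1000)   # 1,000 amplicons of 100 bases
```

Under that uniform-rate assumption, a 10-primer panel of 100-base amplicons covers about 2 expected variable positions and a 1,000-primer panel about 200.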

Predictive Medicine 

The invention also pertains to the field of predictive medicine in which diagnostic 
assays, prognostic assays, pharmacogenomics, and monitoring clinical trials are used for 
prognostic (predictive) purposes to thereby treat an individual prophylactically. Accordingly, 
one aspect of the invention relates to diagnostic assays for determining NOVX protein and/or 
nucleic acid expression as well as NOVX activity, in the context of a biological sample (e.g., 
blood, serum, cells, tissue) to thereby determine whether an individual is afflicted with a 
disease or disorder, or is at risk of developing a disorder, associated with aberrant NOVX 
expression or activity. The disorders include metabolic disorders, diabetes, obesity, infectious 
disease, anorexia, cancer-associated cachexia, cancer, neurodegenerative disorders, 
Alzheimer's Disease, Parkinson's Disorder, immune disorders, and hematopoietic disorders, 
and the various dyslipidemias, metabolic disturbances associated with obesity, the metabolic 
syndrome X and wasting disorders associated with chronic diseases and various cancers. The 
invention also provides for prognostic (or predictive) assays for determining whether an 
individual is at risk of developing a disorder associated with NOVX protein, nucleic acid 
expression or activity. For example, mutations in an NOVX gene can be assayed in a 
biological sample. Such assays can be used for prognostic or predictive purposes to thereby
prophylactically treat an individual prior to the onset of a disorder characterized by or
associated with NOVX protein, nucleic acid expression, or biological activity. 

Another aspect of the invention provides methods for determining NOVX protein, 
nucleic acid expression or activity in an individual to thereby select appropriate therapeutic or 
prophylactic agents for that individual (referred to herein as "pharmacogenomics"). 
Pharmacogenomics allows for the selection of agents (e.g., drugs) for therapeutic or
prophylactic treatment of an individual based on the genotype of the individual (e.g., the
genotype of the individual examined to determine the ability of the individual to respond to a
particular agent).

Yet another aspect of the invention pertains to monitoring the influence of agents 
(e.g., drugs, compounds) on the expression or activity of NOVX in clinical trials. 

These and other agents are described in further detail in the following sections. 

Diagnostic Assays

An exemplary method for detecting the presence or absence of NOVX in a biological 
sample involves obtaining a biological sample from a test subject and contacting the 
biological sample with a compound or an agent capable of detecting NOVX protein or 
nucleic acid (e.g., mRNA, genomic DNA) that encodes NOVX protein such that the presence 
of NOVX is detected in the biological sample. An agent for detecting NOVX mRNA or 
genomic DNA is a labeled nucleic acid probe capable of hybridizing to NOVX mRNA or 
genomic DNA. The nucleic acid probe can be, for example, a full-length NOVX nucleic
acid, such as the nucleic acid of SEQ ID NOS:1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27,
29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77,
79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 99, 101, 103, 105, 107, 109 and 111, or a portion
thereof, such as an oligonucleotide of at least 15, 30, 50, 100, 250 or 500 nucleotides in 
length and sufficient to specifically hybridize under stringent conditions to NOVX mRNA or 
genomic DNA. Other suitable probes for use in the diagnostic assays of the invention are 
described herein. 

An agent for detecting NOVX protein is an antibody capable of binding to NOVX 
protein, preferably an antibody with a detectable label. Antibodies can be polyclonal, or 
more preferably, monoclonal. An intact antibody, or a fragment thereof (e.g., Fab or F(ab')2)
can be used. The term "labeled", with regard to the probe or antibody, is intended to
encompass direct labeling of the probe or antibody by coupling (i.e., physically linking) a
detectable substance to the probe or antibody, as well as indirect labeling of the probe or
antibody by reactivity with another reagent that is directly labeled. Examples of indirect
labeling include detection of a primary antibody using a fluorescently-labeled secondary
antibody and end-labeling of a DNA probe with biotin such that it can be detected with
fluorescently-labeled streptavidin. The term "biological sample" is intended to include
tissues, cells and biological fluids isolated from a subject, as well as tissues, cells and fluids
present within a subject. That is, the detection method of the invention can be used to detect
NOVX mRNA, protein, or genomic DNA in a biological sample in vitro as well as in vivo.
For example, in vitro techniques for detection of NOVX mRNA include Northern
hybridizations and in situ hybridizations. In vitro techniques for detection of NOVX protein
include enzyme linked immunosorbent assays (ELISAs), Western blots,
immunoprecipitations, and immunofluorescence. In vitro techniques for detection of NOVX
genomic DNA include Southern hybridizations. Furthermore, in vivo techniques for
detection of NOVX protein include introducing into a subject a labeled anti-NOVX antibody.
For example, the antibody can be labeled with a radioactive marker whose presence and
location in a subject can be detected by standard imaging techniques.

In one embodiment, the biological sample contains protein molecules from the test 
subject. Alternatively, the biological sample can contain mRNA molecules from the test
subject or genomic DNA molecules from the test subject. A preferred biological sample is a
peripheral blood leukocyte sample isolated by conventional means from a subject.

In another embodiment, the methods further involve obtaining a control biological 
sample from a control subject, contacting the control sample with a compound or agent 
capable of detecting NOVX protein, mRNA, or genomic DNA, such that the presence of 
NOVX protein, mRNA or genomic DNA is detected in the biological sample, and comparing 
the presence of NOVX protein, mRNA or genomic DNA in the control sample with the
presence of NOVX protein, mRNA or genomic DNA in the test sample. 

The invention also encompasses kits for detecting the presence of NOVX in a 
biological sample. For example, the kit can comprise: a labeled compound or agent capable 
of detecting NOVX protein or mRNA in a biological sample; means for determining the 
amount of NOVX in the sample; and means for comparing the amount of NOVX in the
sample with a standard. The compound or agent can be packaged in a suitable container. 
The kit can further comprise instructions for using the kit to detect NOVX protein or nucleic 
acid. 

Prognostic Assays

The diagnostic methods described herein can furthermore be utilized to identify 
subjects having or at risk of developing a disease or disorder associated with aberrant NOVX 
expression or activity. For example, the assays described herein, such as the preceding 
diagnostic assays or the following assays, can be utilized to identify a subject having or at
risk of developing a disorder associated with NOVX protein, nucleic acid expression or
activity. Alternatively, the prognostic assays can be utilized to identify a subject having or at
risk for developing a disease or disorder. Thus, the invention provides a method for
identifying a disease or disorder associated with aberrant NOVX expression or activity in
which a test sample is obtained from a subject and NOVX protein or nucleic acid (e.g.,
mRNA, genomic DNA) is detected, wherein the presence of NOVX protein or nucleic acid is
diagnostic for a subject having or at risk of developing a disease or disorder associated with
aberrant NOVX expression or activity. As used herein, a "test sample" refers to a biological
sample obtained from a subject of interest. For example, a test sample can be a biological
fluid (e.g., serum), cell sample, or tissue.

Furthermore, the prognostic assays described herein can be used to determine whether
a subject can be administered an agent (e.g., an agonist, antagonist, peptidomimetic, protein,
peptide, nucleic acid, small molecule, or other drug candidate) to treat a disease or disorder
associated with aberrant NOVX expression or activity. For example, such methods can be
used to determine whether a subject can be effectively treated with an agent for a disorder.
Thus, the invention provides methods for determining whether a subject can be effectively
treated with an agent for a disorder associated with aberrant NOVX expression or activity in
which a test sample is obtained and NOVX protein or nucleic acid is detected (e.g., wherein
the presence of NOVX protein or nucleic acid is diagnostic for a subject that can be
administered the agent to treat a disorder associated with aberrant NOVX expression or
activity). 

The methods of the invention can also be used to detect genetic lesions in an NOVX
gene, thereby determining if a subject with the lesioned gene is at risk for a disorder
characterized by aberrant cell proliferation and/or differentiation. In various embodiments,
the methods include detecting, in a sample of cells from the subject, the presence or absence
of a genetic lesion characterized by at least one of an alteration affecting the integrity of a
gene encoding an NOVX-protein, or the misexpression of the NOVX gene. For example,
such genetic lesions can be detected by ascertaining the existence of at least one of: (i) a
deletion of one or more nucleotides from an NOVX gene; (ii) an addition of one or more
nucleotides to an NOVX gene; (iii) a substitution of one or more nucleotides of an NOVX
gene; (iv) a chromosomal rearrangement of an NOVX gene; (v) an alteration in the level of a
messenger RNA transcript of an NOVX gene; (vi) aberrant modification of an NOVX gene,
such as of the methylation pattern of the genomic DNA; (vii) the presence of a non-wild-type
splicing pattern of a messenger RNA transcript of an NOVX gene; (viii) a non-wild-type
level of an NOVX protein; (ix) allelic loss of an NOVX gene; and (x) inappropriate
post-translational modification of an NOVX protein. As described herein, there are a large
number of assay techniques known in the art which can be used for detecting lesions in an
NOVX gene. A preferred biological sample is a peripheral blood leukocyte sample isolated
by conventional means from a subject. However, any biological sample containing nucleated
cells may be used, including, for example, buccal mucosal cells.

In certain embodiments, detection of the lesion involves the use of a probe/primer in a
polymerase chain reaction (PCR) (see, e.g., U.S. Patent Nos. 4,683,195 and 4,683,202), such
as anchor PCR or RACE PCR, or, alternatively, in a ligation chain reaction (LCR) (see, e.g.,
Landegran, et al., 1988. Science 241: 1077-1080; and Nakazawa, et al., 1994. Proc. Natl.
Acad. Sci. USA 91: 360-364), the latter of which can be particularly useful for detecting point
mutations in the NOVX-gene (see, Abravaya, et al., 1995. Nucl. Acids Res. 23: 675-682).
This method can include the steps of collecting a sample of cells from a patient, isolating
nucleic acid (e.g., genomic, mRNA or both) from the cells of the sample, contacting the
nucleic acid sample with one or more primers that specifically hybridize to an NOVX gene
under conditions such that hybridization and amplification of the NOVX gene (if present)
occurs, and detecting the presence or absence of an amplification product, or detecting the
size of the amplification product and comparing the length to a control sample. It is
anticipated that PCR and/or LCR may be desirable to use as a preliminary amplification step
in conjunction with any of the techniques used for detecting mutations described herein.

Alternative amplification methods include: self sustained sequence replication (see,
Guatelli, et al., 1990. Proc. Natl. Acad. Sci. USA 87: 1874-1878), transcriptional
amplification system (see, Kwoh, et al., 1989. Proc. Natl. Acad. Sci. USA 86: 1173-1177);
Qβ Replicase (see, Lizardi, et al., 1988. BioTechnology 6: 1197), or any other nucleic acid
amplification method, followed by the detection of the amplified molecules using techniques
well known to those of skill in the art. These detection schemes are especially useful for the
detection of nucleic acid molecules if such molecules are present in very low numbers.

In an alternative embodiment, mutations in an NOVX gene from a sample cell can be
identified by alterations in restriction enzyme cleavage patterns. For example, sample and
control DNA is isolated, amplified (optionally), digested with one or more restriction
endonucleases, and fragment length sizes are determined by gel electrophoresis and
compared. Differences in fragment length sizes between sample and control DNA indicate
mutations in the sample DNA. Moreover, sequence specific ribozymes (see, e.g.,
U.S. Patent No. 5,493,531) can be used to score for the presence of specific mutations by
development or loss of a ribozyme cleavage site.
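The restriction-pattern comparison can be mimicked in silico. The sketch below is a simplified illustration, not a protocol from this document: it cuts a single strand at every occurrence of a recognition site, using a fixed cut offset within the site, and compares the sorted fragment lengths of a control and a sample sequence. The function name, the example sequences, and the use of the GAATTC site with an offset of 1 are assumptions made for the example.

```python
def fragment_lengths(sequence, site, cut_offset=0):
    """Sorted fragment lengths after cutting at each occurrence of `site`.

    cut_offset: position within the recognition site where the strand is
    cut (for example 1 for a G^AATTC cut). Single-strand simplification.
    """
    cuts = []
    pos = sequence.find(site)
    while pos != -1:
        cuts.append(pos + cut_offset)
        pos = sequence.find(site, pos + 1)
    bounds = [0] + cuts + [len(sequence)]
    return sorted(b - a for a, b in zip(bounds, bounds[1:]) if b > a)


# Hypothetical control and sample; the sample has lost the second site.
control = "AAAA" + "GAATTC" + "CCCCCCCC" + "GAATTC" + "TT"
sample = "AAAA" + "GAATTC" + "CCCCCCCC" + "GAGTTC" + "TT"

control_pattern = fragment_lengths(control, "GAATTC", cut_offset=1)
sample_pattern = fragment_lengths(sample, "GAATTC", cut_offset=1)
mutation_suspected = control_pattern != sample_pattern
```

A single base change that destroys a site merges two fragments into one, which is the kind of length difference a gel comparison would reveal.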

In other embodiments, genetic mutations in NOVX can be identified by hybridizing a
sample and control nucleic acids, e.g., DNA or RNA, to high-density arrays containing
hundreds or thousands of oligonucleotide probes. See, e.g., Cronin, et al., 1996. Human
Mutation 7: 244-255; Kozal, et al., 1996. Nat. Med. 2: 753-759. For example, genetic
mutations in NOVX can be identified in two dimensional arrays containing light-generated
DNA probes as described in Cronin, et al., supra. Briefly, a first hybridization array of
probes can be used to scan through long stretches of DNA in a sample and control to identify
base changes between the sequences by making linear arrays of sequential overlapping
probes. This step allows the identification of point mutations. This is followed by a second
hybridization array that allows the characterization of specific mutations by using smaller,
specialized probe arrays complementary to all variants or mutations detected. Each mutation
array is composed of parallel probe sets, one complementary to the wild-type gene and the
other complementary to the mutant gene.
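The first scanning step, with sequential overlapping probes, can be illustrated with a toy model. This sketch rests on strong simplifying assumptions that are not part of this document: hybridization is modeled as an exact string match at the same offset, the sample differs from the reference only by substitutions (equal lengths), and the probe length and function names are hypothetical. Positions that no matching probe covers are reported as candidate base changes.

```python
def tile_probes(reference, probe_len=8):
    """Sequential overlapping probes, one starting at every position."""
    return [(i, reference[i:i + probe_len])
            for i in range(len(reference) - probe_len + 1)]


def unconfirmed_positions(reference, sample, probe_len=8):
    """Positions of `sample` not covered by any perfectly matching probe.

    Toy model: a probe "hybridizes" only if the sample matches it exactly
    at the same offset; substitutions only, equal-length sequences.
    """
    confirmed = [False] * len(reference)
    for start, probe in tile_probes(reference, probe_len):
        if sample[start:start + probe_len] == probe:
            for i in range(start, start + probe_len):
                confirmed[i] = True
    return [i for i, ok in enumerate(confirmed) if not ok]
```

With a single substitution, every probe overlapping the changed base fails while neighboring probes still match, so the unconfirmed region narrows to the changed position.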

In yet another embodiment, any of a variety of sequencing reactions known in the art
can be used to directly sequence the NOVX gene and detect mutations by comparing the
sequence of the sample NOVX with the corresponding wild-type (control) sequence.
Examples of sequencing reactions include those based on techniques developed by Maxam
and Gilbert, 1977. Proc. Natl. Acad. Sci. USA 74: 560 or Sanger, 1977. Proc. Natl. Acad. Sci.
USA 74: 5463. It is also contemplated that any of a variety of automated sequencing
procedures can be utilized when performing the diagnostic assays (see, e.g., Naeve, et al.,
1995. Biotechniques 19: 448), including sequencing by mass spectrometry (see, e.g., PCT
International Publication No. WO 94/16101; Cohen, et al., 1996. Adv. Chromatography 36:
127-162; and Griffin, et al., 1993. Appl. Biochem. Biotechnol. 38: 147-159).

Other methods for detecting mutations in the NOVX gene include methods in which
protection from cleavage agents is used to detect mismatched bases in RNA/RNA or
RNA/DNA heteroduplexes. See, e.g., Myers, et al., 1985. Science 230: 1242. In general, the
art technique of "mismatch cleavage" starts by providing heteroduplexes formed by
hybridizing (labeled) RNA or DNA containing the wild-type NOVX sequence with
potentially mutant RNA or DNA obtained from a tissue sample. The double-stranded
duplexes are treated with an agent that cleaves single-stranded regions of the duplex such as
those which will exist due to basepair mismatches between the control and sample strands. For
instance, RNA/DNA duplexes can be treated with RNase and DNA/DNA hybrids treated
with S1 nuclease to enzymatically digest the mismatched regions. In other embodiments,
either DNA/DNA or RNA/DNA duplexes can be treated with hydroxylamine or osmium
tetroxide and with piperidine in order to digest mismatched regions. After digestion of the
mismatched regions, the resulting material is then separated by size on denaturing
polyacrylamide gels to determine the site of mutation. See, e.g., Cotton, et al., 1988. Proc.
Natl. Acad. Sci. USA 85: 4397; Saleeba, et al., 1992. Methods Enzymol. 217: 286-295. In an
embodiment, the control DNA or RNA can be labeled for detection.

In still another embodiment, the mismatch cleavage reaction employs one or more
proteins that recognize mismatched base pairs in double-stranded DNA (so called "DNA
mismatch repair" enzymes) in defined systems for detecting and mapping point mutations in
NOVX cDNAs obtained from samples of cells. For example, the mutY enzyme of E. coli
cleaves A at G/A mismatches and the thymidine DNA glycosylase from HeLa cells cleaves T
at G/T mismatches. See, e.g., Hsu, et al., 1994. Carcinogenesis 15: 1657-1662. According to
an exemplary embodiment, a probe based on an NOVX sequence, e.g., a wild-type NOVX
sequence, is hybridized to a cDNA or other DNA product from a test cell(s). The duplex is
treated with a DNA mismatch repair enzyme, and the cleavage products, if any, can be
detected from electrophoresis protocols or the like. See, e.g., U.S. Patent No. 5,459,039.

In other embodiments, alterations in electrophoretic mobility will be used to identify
mutations in NOVX genes. For example, single strand conformation polymorphism (SSCP)
may be used to detect differences in electrophoretic mobility between mutant and wild type
nucleic acids. See, e.g., Orita, et al., 1989. Proc. Natl. Acad. Sci. USA 86: 2766; Cotton,
1993. Mutat. Res. 285: 125-144; Hayashi, 1992. Genet. Anal. Tech. Appl. 9: 73-79.
Single-stranded DNA fragments of sample and control NOVX nucleic acids will be
denatured and allowed to renature. The secondary structure of single-stranded nucleic acids
varies according to sequence; the resulting alteration in electrophoretic mobility enables the
detection of even a single base change. The DNA fragments may be labeled or detected with
labeled probes. The sensitivity of the assay may be enhanced by using RNA (rather than
DNA), in which the secondary structure is more sensitive to a change in sequence. In one
embodiment, the subject method utilizes heteroduplex analysis to separate double stranded
heteroduplex molecules on the basis of changes in electrophoretic mobility. See, e.g., Keen,
et al., 1991. Trends Genet. 7: 5.

In yet another embodiment, the movement of mutant or wild-type fragments in 
polyacrylamide gels containing a gradient of denaturant is assayed using denaturing gradient 
gel electrophoresis (DGGE). See, e.g., Myers, et al., 1985. Nature 313: 495. When DGGE is
used as the method of analysis, DNA will be modified to ensure that it does not completely
denature, for example by adding a GC clamp of approximately 40 bp of high-melting
GC-rich DNA by PCR. In a further embodiment, a temperature gradient is used in place of a
denaturing gradient to identify differences in the mobility of control and sample DNA. See, 
e.g., Rosenbaum and Reissner, 1987. Biophys. Chem. 265: 12753. 

Examples of other techniques for detecting point mutations include, but are not 
limited to, selective oligonucleotide hybridization, selective amplification, or selective primer 
extension. For example, oligonucleotide primers may be prepared in which the known 
mutation is placed centrally and then hybridized to target DNA under conditions that permit 
hybridization only if a perfect match is found. See, e.g., Saiki, et al., 1986. Nature 324: 163;
Saiki, et al., 1989. Proc. Natl. Acad. Sci. USA 86: 6230. Such allele specific oligonucleotides
are hybridized to PCR amplified target DNA or a number of different mutations when the
oligonucleotides are attached to the hybridizing membrane and hybridized with labeled target 
DNA. 
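Placing a known mutation centrally in an allele specific oligonucleotide can be sketched as below. This is an illustrative sketch under simplifying assumptions that are not part of this document: sequences are handled on a single strand, so "hybridization only if a perfect match is found" is modeled as an exact substring match rather than as pairing with the complementary strand, and the 19-base length and function names are hypothetical.

```python
def allele_specific_oligos(wild_type, position, mutant_base, length=19):
    """Build wild-type and mutant oligos with the variant base in the center.

    position: 0-based index of the known mutation in `wild_type`.
    length:   odd oligo length, so the mutation sits exactly in the middle.
    Returns (wild_type_oligo, mutant_oligo).
    """
    if length % 2 == 0:
        raise ValueError("length must be odd to center the mutation")
    half = length // 2
    start = position - half
    if start < 0 or start + length > len(wild_type):
        raise ValueError("mutation is too close to the end of the sequence")
    wt_oligo = wild_type[start:start + length]
    mutant_oligo = wt_oligo[:half] + mutant_base + wt_oligo[half + 1:]
    return wt_oligo, mutant_oligo


def perfect_match(oligo, target):
    """Toy hybridization rule: the oligo must occur exactly in the target."""
    return oligo in target
```

Under this model the mutant oligo matches only a target carrying the mutation, and the wild-type oligo matches only the unmutated target.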

Alternatively, allele specific amplification technology that depends on selective PCR
amplification may be used in conjunction with the instant invention. Oligonucleotides used
as primers for specific amplification may carry the mutation of interest in the center of the
molecule (so that amplification depends on differential hybridization; see, e.g., Gibbs, et al.,
1989. Nucl. Acids Res. 17: 2437-2448) or at the extreme 3'-terminus of one primer where,
under appropriate conditions, mismatch can prevent or reduce polymerase extension (see,
e.g., Prossner, 1993. Tibtech 11: 238). In addition it may be desirable to introduce a novel
restriction site in the region of the mutation to create cleavage-based detection. See, e.g.,
Gasparini, et al., 1992. Mol. Cell Probes 6: 1. It is anticipated that in certain embodiments
amplification may also be performed using Taq ligase for amplification. See, e.g., Barany,
1991. Proc. Natl. Acad. Sci. USA 88: 189. In such cases, ligation will occur only if there is a
perfect match at the 3'-terminus of the 5' sequence, making it possible to detect the presence
of a known mutation at a specific site by looking for the presence or absence of amplification.

The methods described herein may be performed, for example, by utilizing
pre-packaged diagnostic kits comprising at least one probe nucleic acid or antibody reagent
described herein, which may be conveniently used, e.g., in clinical settings to diagnose
patients exhibiting symptoms or family history of a disease or illness involving an NOVX 
gene. 

Furthermore, any cell type or tissue, preferably peripheral blood leukocytes, in which 
NOVX is expressed may be utilized in the prognostic assays described herein. However, any 
biological sample containing nucleated cells may be used, including, for example, buccal 
mucosal cells.

Pharmacogenomics 

Agents or modulators that have a stimulatory or inhibitory effect on NOVX activity
(e.g., NOVX gene expression), as identified by a screening assay described herein, can be
administered to individuals to treat (prophylactically or therapeutically) disorders. (The
disorders include metabolic disorders, diabetes, obesity, infectious disease, anorexia,
cancer-associated cachexia, cancer, neurodegenerative disorders, Alzheimer's Disease,
Parkinson's Disorder, immune disorders, and hematopoietic disorders, and the various
dyslipidemias, metabolic disturbances associated with obesity, the metabolic syndrome X and
wasting disorders associated with chronic diseases and various cancers.) In conjunction with such
treatment, the pharmacogenomics (i.e., the study of the relationship between an individual's
genotype and that individual's response to a foreign compound or drug) of the individual may
be considered. Differences in metabolism of therapeutics can lead to severe toxicity or 
therapeutic failure by altering the relation between dose and blood concentration of the 
pharmacologically active drug. Thus, the pharmacogenomics of the individual permits the 
selection of effective agents (e.g., drugs) for prophylactic or therapeutic treatments based on a 
consideration of the individual's genotype. Such pharmacogenomics can further be used to 
determine appropriate dosages and therapeutic regimens. Accordingly, the activity of NOVX 
protein, expression of NOVX nucleic acid, or mutation content of NOVX genes in an 
individual can be determined to thereby select appropriate agent(s) for therapeutic or 
prophylactic treatment of the individual. 

Pharmacogenomics deals with clinically significant hereditary variations in the 
response to drugs due to altered drug disposition and abnormal action in affected persons. 
See, e.g., Eichelbaum, 1996. Clin. Exp. Pharmacol. Physiol. 23: 983-985; Linder, 1997. Clin.
Chem. 43: 254-266. In general, two types of pharmacogenetic conditions can be
differentiated: genetic conditions transmitted as a single factor altering the way drugs act on
the body (altered drug action) or genetic conditions transmitted as single factors altering the
way the body acts on drugs (altered drug metabolism). These pharmacogenetic conditions
can occur either as rare defects or as polymorphisms. For example, glucose-6-phosphate
dehydrogenase (G6PD) deficiency is a common inherited enzymopathy in which the main
clinical complication is hemolysis after ingestion of oxidant drugs (anti-malarials,
sulfonamides, analgesics, nitrofurans) and consumption of fava beans.

As an illustrative embodiment, the activity of drug metabolizing enzymes is a major
determinant of both the intensity and duration of drug action. The discovery of genetic
polymorphisms of drug metabolizing enzymes (e.g., N-acetyltransferase 2 (NAT 2) and
cytochrome P450 enzymes CYP2D6 and CYP2C19) has provided an explanation as to why
some patients do not obtain the expected drug effects or show exaggerated drug response and
serious toxicity after taking the standard and safe dose of a drug. These polymorphisms are
expressed in two phenotypes in the population, the extensive metabolizer (EM) and poor
metabolizer (PM). The prevalence of PM is different among different populations. For
example, the gene coding for CYP2D6 is highly polymorphic and several mutations have
been identified in PM, which all lead to the absence of functional CYP2D6. Poor
metabolizers of CYP2D6 and CYP2C19 quite frequently experience exaggerated drug
response and side effects when they receive standard doses. If a metabolite is the active
therapeutic moiety, PM show no therapeutic response, as demonstrated for the analgesic
effect of codeine mediated by its CYP2D6-formed metabolite morphine. At the other
extreme are the so called ultra-rapid metabolizers who do not respond to standard doses.
Recently, the molecular basis of ultra-rapid metabolism has been identified to be due to
CYP2D6 gene amplification.

Thus, the activity of NOVX protein, expression of NOVX nucleic acid, or mutation
content of NOVX genes in an individual can be determined to thereby select appropriate
agent(s) for therapeutic or prophylactic treatment of the individual. In addition,
pharmacogenetic studies can be used to apply genotyping of polymorphic alleles encoding
drug-metabolizing enzymes to the identification of an individual's drug responsiveness
phenotype. This knowledge, when applied to dosing or drug selection, can avoid adverse
reactions or therapeutic failure and thus enhance therapeutic or prophylactic efficiency when
treating a subject with an NOVX modulator, such as a modulator identified by one of the
exemplary screening assays described herein.

Monitoring of Effects During Clinical Trials

Monitoring the influence of agents (e.g., drugs, compounds) on the expression or 
activity of NOVX (e.g., the ability to modulate aberrant cell proliferation and/or 
differentiation) can be applied not only in basic drug screening, but also in clinical trials. For 
example, the effectiveness of an agent determined by a screening assay as described herein to 
increase NOVX gene expression, protein levels, or upregulate NOVX activity, can be 
monitored in clinical trials of subjects exhibiting decreased NOVX gene expression, protein
levels, or downregulated NOVX activity. Alternatively, the effectiveness of an agent
determined by a screening assay to decrease NOVX gene expression, protein levels, or
downregulate NOVX activity, can be monitored in clinical trials of subjects exhibiting
increased NOVX gene expression, protein levels, or upregulated NOVX activity. In such 
clinical trials, the expression or activity of NOVX and, preferably, other genes that have been 
implicated in, for example, a cellular proliferation or immune disorder can be used as a "read 
out" or markers of the immune responsiveness of a particular cell. 

By way of example, and not of limitation, genes, including NOVX, that are 
modulated in cells by treatment with an agent (e.g., compound, drug or small molecule) that 
modulates NOVX activity (e.g., identified in a screening assay as described herein) can be 
identified. Thus, to study the effect of agents on cellular proliferation disorders, for example, 
in a clinical trial, cells can be isolated and RNA prepared and analyzed for the levels of 
expression of NOVX and other genes implicated in the disorder. The levels of gene
expression (i.e., a gene expression pattern) can be quantified by Northern blot analysis or
RT-PCR, as described herein, or alternatively by measuring the amount of protein produced, 
by one of the methods as described herein, or by measuring the levels of activity of NOVX or 
other genes. In this manner, the gene expression pattern can serve as a marker, indicative of 
the physiological response of the cells to the agent. Accordingly, this response state may be 
determined before, and at various points during, treatment of the individual with the agent. 

In one embodiment, the invention provides a method for monitoring the effectiveness 
of treatment of a subject with an agent (e.g., an agonist, antagonist, protein, peptide, 
peptidomimetic, nucleic acid, small molecule, or other drug candidate identified by the 
screening assays described herein) comprising the steps of (i) obtaining a pre-administration
sample from a subject prior to administration of the agent; (ii) detecting the level of
expression of an NOVX protein, mRNA, or genomic DNA in the pre-administration sample;
(iii) obtaining one or more post-administration samples from the subject; (iv) detecting the
level of expression or activity of the NOVX protein, mRNA, or genomic DNA in the
post-administration samples; (v) comparing the level of expression or activity of the NOVX
protein, mRNA, or genomic DNA in the pre-administration sample with the NOVX protein,
mRNA, or genomic DNA in the post-administration sample or samples; and (vi) altering the
administration of the agent to the subject accordingly. For example, increased administration
of the agent may be desirable to increase the expression or activity of NOVX to higher levels
than detected, i.e., to increase the effectiveness of the agent. Alternatively, decreased
administration of the agent may be desirable to decrease expression or activity of NOVX to
lower levels than detected, i.e., to decrease the effectiveness of the agent.
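
By way of illustration only, the comparison of step (v) may be sketched in Python as below. The function name, the use of the most recent post-administration sample, and the simple ratio are assumptions of this sketch; the expression levels may be in any consistent arbitrary units, and the sketch does not itself decide how administration should be altered in step (vi).

```python
def compare_pre_post(pre_level, post_levels, goal):
    """Compare a pre-administration NOVX level with post-administration levels.

    pre_level   : level measured in the pre-administration sample (> 0)
    post_levels : list of levels from one or more post-administration samples
    goal        : "increase" or "decrease", the intended effect of the agent
    Returns the post/pre ratio for the latest sample and whether the level
    moved in the intended direction.
    """
    if goal not in ("increase", "decrease"):
        raise ValueError("goal must be 'increase' or 'decrease'")
    if pre_level <= 0 or not post_levels:
        raise ValueError("need a positive pre level and at least one post level")
    ratio = post_levels[-1] / pre_level  # hypothetical choice: latest sample
    moved_as_intended = ratio > 1.0 if goal == "increase" else ratio < 1.0
    return {"ratio": ratio, "moved_as_intended": moved_as_intended}
```

A result in which the level has not moved in the intended direction would correspond to the situation in which the administration of the agent may be altered.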

Methods of Treatment 

The invention provides for both prophylactic and therapeutic methods of treating a 
subject at risk of (or susceptible to) a disorder or having a disorder associated with aberrant 
NOVX expression or activity. The disorders include cardiomyopathy, atherosclerosis, 
hypertension, congenital heart defects, aortic stenosis, atrial septal defect (ASD), 
atrioventricular (A-V) canal defect, ductus arteriosus, pulmonary stenosis, subaortic stenosis, 
ventricular septal defect (VSD), valve diseases, tuberous sclerosis, scleroderma, obesity, 
transplantation, adrenoleukodystrophy, congenital adrenal hyperplasia, prostate cancer, 
neoplasm, adenocarcinoma, lymphoma, uterus cancer, fertility, hemophilia,
hypercoagulation, idiopathic thrombocytopenic purpura, immunodeficiencies, graft versus
host disease, AIDS, bronchial asthma, Crohn's disease, multiple sclerosis, treatment of
Albright Hereditary Osteodystrophy, and other diseases, disorders and conditions of the like.

These methods of treatment will be discussed more fully below. 

Diseases and Disorders

Diseases and disorders that are characterized by increased (relative to a subject not
suffering from the disease or disorder) levels or biological activity may be treated with
Therapeutics that antagonize (i.e., reduce or inhibit) activity. Therapeutics that antagonize
activity may be administered in a therapeutic or prophylactic manner. Therapeutics that may
be utilized include, but are not limited to: (i) an aforementioned peptide, or analogs,
derivatives, fragments or homologs thereof; (ii) antibodies to an aforementioned peptide; (iii)
nucleic acids encoding an aforementioned peptide; (iv) administration of antisense nucleic
acid and nucleic acids that are "dysfunctional" (i.e., due to a heterologous insertion within the
coding sequences of an aforementioned peptide) that are utilized to
"knockout" endogenous function of an aforementioned peptide by homologous recombination
(see, e.g., Capecchi, 1989. Science 244: 1288-1292); or (v) modulators (i.e., inhibitors,
agonists and antagonists, including additional peptide mimetic of the invention or antibodies
specific to a peptide of the invention) that alter the interaction between an aforementioned
peptide and its binding partner.

Diseases and disorders that are characterized by decreased (relative to a subject not 
suffering from the disease or disorder) levels or biological activity may be treated with 
Therapeutics that increase (i.e., are agonists to) activity. Therapeutics that upregulate activity
may be administered in a therapeutic or prophylactic manner. Therapeutics that may be 
utilized include, but are not limited to, an aforementioned peptide, or analogs, derivatives, 
fragments or homologs thereof; or an agonist that increases bioavailability. 

Increased or decreased levels can be readily detected by quantifying peptide and/or 
RNA, by obtaining a patient tissue sample (e.g., from biopsy tissue) and assaying it in vitro 
for RNA or peptide levels, structure and/or activity of the expressed peptides (or mRNAs of 
an aforementioned peptide). Methods that are well-known within the art include, but are not 
limited to, immunoassays (e.g., by Western blot analysis, immunoprecipitation followed by 
sodium dodecyl sulfate (SDS) polyacrylamide gel electrophoresis, immunocytochemistry, 
etc.) and/or hybridization assays to detect expression of mRNAs (e.g., Northern assays, dot
blots, in situ hybridization, and the like). 

Prophylactic Methods 

In one aspect, the invention provides a method for preventing, in a subject, a disease 
or condition associated with an aberrant NOVX expression or activity, by administering to 
the subject an agent that modulates NOVX expression or at least one NOVX activity. 
Subjects at risk for a disease that is caused or contributed to by aberrant NOVX expression or 
activity can be identified by, for example, any or a combination of diagnostic or prognostic 
assays as described herein. Administration of a prophylactic agent can occur prior to the 
manifestation of symptoms characteristic of the NOVX aberrancy, such that a disease or 
disorder is prevented or, alternatively, delayed in its progression. Depending upon the type
of NOVX aberrancy, for example, an NOVX agonist or NOVX antagonist agent can be used 
for treating the subject. The appropriate agent can be determined based on screening assays 
described herein. The prophylactic methods of the invention are further discussed in the 
following subsections. 

Therapeutic Methods

Another aspect of the invention pertains to methods of modulating NOVX expression
or activity for therapeutic purposes. The modulatory method of the invention involves
contacting a cell with an agent that modulates one or more of the activities of NOVX protein
associated with the cell. An agent that modulates NOVX protein activity can be an
agent as described herein, such as a nucleic acid or a protein, a naturally-occurring cognate
ligand of an NOVX protein, a peptide, an NOVX peptidomimetic, or other small molecule.
In one embodiment, the agent stimulates one or more NOVX protein activities. Examples of
such stimulatory agents include active NOVX protein and a nucleic acid molecule encoding
NOVX that has been introduced into the cell. In another embodiment, the agent inhibits one
or more NOVX protein activities. Examples of such inhibitory agents include antisense
NOVX nucleic acid molecules and anti-NOVX antibodies. These modulatory methods can
be performed in vitro (e.g., by culturing the cell with the agent) or, alternatively, in vivo (e.g.,
by administering the agent to a subject). As such, the invention provides methods of treating
an individual afflicted with a disease or disorder characterized by aberrant expression or
activity of an NOVX protein or nucleic acid molecule. In one embodiment, the method
involves administering an agent (e.g., an agent identified by a screening assay described
herein), or combination of agents that modulates (e.g., up-regulates or down-regulates)
NOVX expression or activity. In another embodiment, the method involves administering an
NOVX protein or nucleic acid molecule as therapy to compensate for reduced or aberrant
NOVX expression or activity.

Stimulation of NOVX activity is desirable in situations in which NOVX is abnormally
downregulated and/or in which increased NOVX activity is likely to have a beneficial effect.
One example of such a situation is where a subject has a disorder characterized by aberrant
cell proliferation and/or differentiation (e.g., cancer or immune associated disorders).
Another example of such a situation is where the subject has a gestational disease (e.g.,
preeclampsia).

Determination of the Biological Effect of the Therapeutic 

In various embodiments of the invention, suitable in vitro or in vivo assays are
performed to determine the effect of a specific Therapeutic and whether its administration is
indicated for treatment of the affected tissue.

In various specific embodiments, in vitro assays may be performed with
representative cells of the type(s) involved in the patient's disorder, to determine if a given
Therapeutic exerts the desired effect upon the cell type(s). Compounds for use in therapy
may be tested in suitable animal model systems including, but not limited to, rats, mice,
chicken, cows, monkeys, rabbits, and the like, prior to testing in human subjects. Similarly,
for in vivo testing, any of the animal model systems known in the art may be used prior to
administration to human subjects.

Prophylactic and Therapeutic Uses of the Compositions of the Invention 

The NOVX nucleic acids and proteins of the invention are useful in potential 
prophylactic and therapeutic applications implicated in a variety of disorders including, but 
not limited to: metabolic disorders, diabetes, obesity, infectious disease, anorexia,
cancer-associated cachexia, cancer, neurodegenerative disorders, Alzheimer's Disease,
Parkinson's Disorder,
immune disorders, hematopoietic disorders, and the various dyslipidemias, metabolic 
disturbances associated with obesity, the metabolic syndrome X and wasting disorders 
associated with chronic diseases and various cancers. 

As an example, a cDNA encoding the NOVX protein of the invention may be useful 
in gene therapy, and the protein may be useful when administered to a subject in need 
thereof. By way of non-limiting example, the compositions of the invention will have
efficacy for treatment of patients suffering from: metabolic disorders, diabetes, obesity,
infectious disease, anorexia, cancer-associated cachexia, cancer, neurodegenerative disorders,
Alzheimer's Disease, Parkinson's Disorder, immune disorders, hematopoietic disorders, and
the various dyslipidemias. 

Both the novel nucleic acid encoding the NOVX protein, and the NOVX protein of 
the invention, or fragments thereof, may also be useful in diagnostic applications, wherein the 
presence or amount of the nucleic acid or the protein are to be assessed. A further use could
be as an anti-bacterial molecule (i.e., some peptides have been found to possess anti-bacterial
properties). These materials are further useful in the generation of antibodies, which
immunospecifically bind to the novel substances of the invention for use in therapeutic or
diagnostic methods. 

General Screening and Diagnostic Methods

Several of the herein disclosed methods relate to comparing the levels of expression
of angiopoietin related protein (ARP) nucleic acids or polypeptides in test and reference cell
populations. The sequence information disclosed herein, coupled with nucleic acid detection
methods known in the art, allows for detection and comparison of the ARP transcripts.

In its various aspects and embodiments, the invention includes providing a test cell
population which includes at least one cell that is capable of expressing ARP. By "capable of
expressing" is meant that the gene is present in an intact form in the cell and is expressed
under particular conditions. Using sequence information provided by the database entries for 
the ARP sequences, ARP sequences can be detected (if present) and measured using 
techniques well known to one of ordinary skill in the art. For example, sequences within the 
sequence database entries corresponding to ARP, or within the sequences disclosed herein, 
can be used to construct probes for detecting ARP RNA sequences in, e.g., northern blot 
hybridization analyses or methods which specifically, and, preferably, quantitatively amplify 
specific nucleic acid sequences. As another example, the sequences can be used to construct 
primers for specifically amplifying the ARP sequences in, e.g., amplification-based detection
methods such as reverse-transcription based polymerase chain reaction. When alterations in 
gene expression are associated with gene amplification or deletion, sequence comparisons in 
test and reference populations can be made by comparing relative amounts of the examined 
DNA sequences in the test and reference cell populations. 

For ARP sequences whose polypeptide product is known, expression can also be
measured at the protein level, i.e., by measuring the levels of polypeptides encoded by the 
gene products described herein. Such methods are well known in the art and include, e.g., 
immunoassays based on antibodies to proteins encoded by the genes. 

Expression level of the ARP sequences in the test cell population is then compared to 
expression levels of the ARP in one or more cells from a reference cell population. 
Expression of sequences in test and control populations of cells can be compared using any 
art-recognized method for comparing expression of nucleic acid sequences. For example, 
expression can be compared using GENECALLING® methods as described in US Patent No. 
5,871,697 and in Shimkets et al., Nat. Biotechnol. 17:798-803. 

In various embodiments, the expression of ARP is measured. If desired, expression
of these sequences can be measured along with other sequences whose expression is known 
to be altered according to one of the herein described parameters or conditions. 

The reference cell population includes one or more cells capable of expressing the
measured ARP sequences and for which the compared parameter is known, e.g., exposure to a
test agent, disease status or PPARy expression status. By "disease status" is meant that it is
known whether the reference cell has the disease state being screened (e.g., renal disorders
such as clear cell renal carcinoma, kidney cancer, renal dysplasia, or inflammatory disorders
such as allergy, asthma, emphysema). By "PPARy expression status" is meant that it is known
whether the reference cell has had contact with a PPARy ligand, e.g.,
N-(2-benzoylphenyl)-L-tyrosine.
Whether or not comparison of the gene expression profile in the test cell population to the
reference cell population reveals the presence, or degree, of the measured parameter depends
on the composition of the reference cell population. For example, if the reference cell
population is composed of cells that have not been treated with a known PPARy ligand, a
similar gene expression level in the test cell population and a reference cell population
indicates the test agent is not a PPARy ligand. Conversely, if the reference cell population is
made up of cells that have been treated with a known PPARy ligand, a similar gene
expression profile between the test cell population and the reference cell population indicates
the test agent is a PPARy ligand.
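
The dependence of the interpretation on the composition of the reference cell population may be illustrated by the following minimal Python sketch. The function name and Boolean inputs are hypothetical. The text above spells out only the two cases in which the profiles are similar; treating a dissimilar profile as the opposite call is an assumption of this sketch.

```python
def indicates_ppary_ligand(similar_to_reference, reference_treated_with_ligand):
    """Return True if the comparison indicates the test agent is a PPARy ligand.

    similar_to_reference          : True if test and reference expression
                                    levels are judged similar
    reference_treated_with_ligand : True if the reference cells were treated
                                    with a known PPARy ligand
    Cases stated in the text:
      similar  + untreated reference -> not a ligand
      similar  + treated reference   -> ligand
    Assumed here: a dissimilar profile gives the opposite call.
    """
    return similar_to_reference == reference_treated_with_ligand
```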

In various embodiments, an ARP sequence in a test cell population is considered
comparable in expression level to the expression level of the ARP sequence if its expression
level varies within a factor of 2.0, 1.5, or 1.0 fold to the level of the ARP transcript in the
reference cell population. In various embodiments, an ARP sequence in a test cell population
can be considered altered in levels of expression if its expression level varies from the
reference cell population by more than 1.0, 1.5, 2.0 or more fold from the expression level of
the corresponding ARP sequence in the reference cell population.
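
A fold-based comparison of this kind may be sketched in Python as follows. Reading "fold" symmetrically (so that a two-fold decrease and a two-fold increase are treated alike) and the default threshold of 2.0 are assumptions of this sketch, as are the function names.

```python
def fold_difference(test_level, reference_level):
    """Symmetric fold difference (always >= 1.0) between two positive levels."""
    if test_level <= 0 or reference_level <= 0:
        raise ValueError("expression levels must be positive")
    ratio = test_level / reference_level
    return max(ratio, 1.0 / ratio)


def is_altered(test_level, reference_level, threshold=2.0):
    """True if the test level differs from the reference by more than
    `threshold` fold; otherwise the levels are treated as comparable."""
    return fold_difference(test_level, reference_level) > threshold
```

The threshold can be set to any of the recited values (e.g., 1.0, 1.5 or 2.0); a lower threshold classifies smaller differences as altered.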

If desired, comparison of differentially expressed sequences between a test cell 
population and a reference cell population can be done with respect to a control nucleic acid 
whose expression is independent of the parameter or condition being measured. Expression 
levels of the control nucleic acid in the test and reference nucleic acid can be used to 
normalize signal levels in the compared populations. Suitable control nucleic acids can 
readily be determined by one of ordinary skill in the art. 
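
Normalization to a control nucleic acid may be illustrated by the following Python sketch, in which each ARP signal is divided by the control signal measured in the same population before the test and reference populations are compared. The function names and the simple division are assumptions of this sketch; signals may be in any consistent arbitrary units.

```python
def normalize(signal, control_signal):
    """Divide a signal by the control nucleic acid signal from the same sample."""
    if control_signal <= 0:
        raise ValueError("control signal must be positive")
    return signal / control_signal


def normalized_ratio(test_arp, test_control, ref_arp, ref_control):
    """Ratio of control-normalized ARP signal in test versus reference cells."""
    ref_norm = normalize(ref_arp, ref_control)
    if ref_norm <= 0:
        raise ValueError("reference ARP signal must be positive")
    return normalize(test_arp, test_control) / ref_norm
```

Because the control is chosen so that its expression is independent of the parameter being measured, differences in sample loading or signal strength between the two populations cancel in the ratio.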

In some embodiments, the test cell population is compared to multiple reference cell 
populations. Each of the multiple reference populations may differ in the known parameter. 
For example, a test cell population may be compared to a first reference cell population 
known to have been exposed to a PPARy ligand, as well as a second reference population
known not to have been exposed to a PPARy ligand.
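
Where several reference populations are available, one simple way to use them is to ask which reference the test population most resembles. The Python sketch below does this for a single expression level using distance on a log2 scale; the labels, the distance measure, and the function name are all assumptions of this sketch rather than features recited above.

```python
import math


def closest_reference(test_level, references):
    """Return the label of the reference population closest to the test level.

    test_level : positive expression level in the test cell population
    references : dict mapping a label (e.g. "exposed", "unexposed") to the
                 positive expression level of that reference population
    """
    if test_level <= 0 or any(v <= 0 for v in references.values()):
        raise ValueError("expression levels must be positive")
    return min(
        references,
        key=lambda label: abs(math.log2(test_level / references[label])),
    )
```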

The test cell population that is exposed to, i.e., contacted with, the test ligand can be
any number of cells, i.e., one or more cells, and can be provided in vitro, in vivo, or ex vivo.

In other embodiments, the test cell population can be divided into two or more 
subpopulations. The subpopulations can be created by dividing the first population of cells to 
create as identical a subpopulation as possible. This will be suitable in, for example, in vitro
or ex vivo screening methods. In some embodiments, various subpopulations can be exposed
to a control agent, and/or a test agent, multiple test agents, or, e.g., varying dosages of one or
multiple test agents administered together, or in various combinations. 

Preferably, cells in the reference cell population are derived from a tissue type as
similar as possible to that of the test cell, e.g., adipose tissue or liver tissue. In some
embodiments, the
control cell is derived from the same subject as the test cell, e.g., from a region proximal to 
the region of origin of the test cell. In other embodiments, the reference cell population is 
derived from a plurality of cells. For example, the reference cell population can be a database 
of expression patterns from previously tested cells for which one of the herein-described 
parameters or conditions (e.g., PPARy status, screening, diagnostic, or therapeutic claims) is
known. 

The subject is preferably a mammal. The mammal can be, e.g., a human, non-human 
primate, mouse, rat, dog, cat, horse, or cow. 

Screening for PPARy ligands 

In one aspect, the invention provides a method of identifying PPARy ligands. The 
PPARy ligand can be identified by providing a cell population that includes cells capable of
expressing angiopoietin related protein (ARP). The sequences need not be identical to
sequences including ARP, as long as the sequence is sufficiently similar that specific
hybridization can be detected. Preferably, the cell includes sequences that are identical, or
nearly identical, to those identifying the ARP nucleic acid or polypeptide.

Expression of the nucleic acid sequences in the test cell population is then compared 
to the expression of the nucleic acid sequences in a reference cell population, which is a cell 
population that has not been exposed to the test agent, or, in some embodiments, a cell 
population exposed to the test agent. Comparison can be performed on test and reference 
samples measured concurrently or at temporally distinct times. An example of the latter is 
the use of compiled expression information, e.g., a sequence database, which assembles 
information about expression levels of known sequences following administration of various 
agents. For example, alteration of expression levels following administration of test agent 
can be compared to the expression changes observed in the nucleic acid sequences following 
administration of a control agent, such as N-(2-benzoylphenyl)-L-tyrosine.

Finding an alteration (e.g., an increase) in the level of expression of the nucleic acid
sequence in the test cell population compared to the expression of the nucleic acid sequence
in the reference cell population that has not been exposed to the test agent indicates the test
agent is a PPARy ligand.

The invention also includes a PPARy ligand identified according to this screening 
method, and a pharmaceutical composition comprising the PPARy ligands so identified. 

Screening assays for identifying a candidate therapeutic agent for treating or preventing pathophysiologies associated with the PPARy mediated pathway

The differentially expressed sequences disclosed herein can also be used to identify 
candidate therapeutic agents for pathophysiologies associated with the PPARy mediated
pathway. The method is based on screening a candidate therapeutic agent to determine if it 
converts an expression profile of ARP protein or nucleic acid characteristic of a PPARy 
response. 

In the method a cell is exposed to a test agent or a combination of test agents 
(sequentially or simultaneously) and the expression of ARP is measured. The expression of the
ARP in the test population is compared to the expression level of the ARP in a reference cell
population whose PPARy status is known. If the reference cell population contains cells that
have not been exposed to a PPARy ligand, alteration of expression of the nucleic acids in the
test cell population as compared to the reference cell population indicates that the test agent is 
a candidate therapeutic agent. 

In some embodiments, the reference cell population includes cells that have been 
exposed to a test agent. When this cell population is used, an alteration in expression of the
nucleic acid sequences in the presence of the agent from the expression profile of the cell 
population in the absence of the agent indicates the agent is a candidate therapeutic agent. In 
other embodiments the test cell population includes cells that have not been exposed to a 
PPARy ligand. For this cell population, a similarity in expression of ARP in the test and 
control cell populations indicates the test agent is not a candidate therapeutic agent, while a 
difference suggests it is a candidate. 

The test agent can be a compound not previously described or can be a previously 
known compound but which is not known to be a PPARy ligand.

An agent effective in stimulating expression of underexpressed genes, or in 
suppressing expression of overexpressed genes can be further tested for its ability to prevent 
the PPARy mediated pathophysiology, e.g., NIDDM, and as a potential therapeutic useful for
the treatment of such pathophysiology. Further evaluation of the clinical usefulness of such a
compound can be performed using standard methods of evaluating toxicity and clinical 
effectiveness of anti-diabetic agents. 



Selecting a therapeutic agent for treating a pathophysiology associated with the PPARy mediated pathway that is appropriate for a particular individual

Differences in the genetic makeup of individuals can result in differences in their
relative abilities to metabolize various drugs. An agent that is metabolized in a subject to act
as a PPARy ligand can manifest itself by inducing a change in gene expression pattern in the
subject's cells from that characteristic of a pathophysiologic state to a gene expression pattern
characteristic of a non-pathophysiologic state. Accordingly, the differentially expressed ARP
allow for a putative therapeutic or prophylactic agent to be tested in a test cell population
from a selected subject in order to determine if the agent is a suitable PPARy ligand in the
subject.

To identify a PPARy ligand that is appropriate for a specific subject, a test cell
population from the subject is exposed to a therapeutic agent, and the expression of ARP is
measured.

In some embodiments, the test cell population contains an adipocyte. In other
embodiments, the agent is first mixed with a cell extract, e.g., a liver cell extract or an
adipose cell extract, which contains enzymes that metabolize drugs into an active form. The
activated form of the therapeutic agent can then be mixed with the test cell population and
gene expression measured. Preferably, the cell population is contacted ex vivo with the agent
or activated form of the agent.

Expression of the nucleic acid sequences in the test cell population is then compared 
to the expression of the nucleic acid sequences in a reference cell population. The reference cell 
population includes at least one cell whose PPARy status is known. If the reference cell has 
been exposed to a PPARy ligand, a similar gene expression profile between the test cell 
population and the reference cell population indicates the agent is suitable for treating the 
pathophysiology in the subject. A difference in expression between sequences in the test cell 
population and those in the reference cell population indicates that the agent is not suitable 
for treating the PPARy pathophysiology in the subject. 

If the reference cell has not been exposed to a PPARy ligand, a similarity in gene 
expression patterns between the test cell population and the reference cell population 
indicates the agent is not suitable for treating the PPARy pathophysiology in the subject, 
while dissimilar gene expression patterns indicate the agent will be suitable for treating the 
subject.
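
By way of illustration only, the decision rule described in the two preceding paragraphs can be sketched in Python. The specification does not prescribe a similarity measure; the Pearson correlation, the 0.8 threshold, and the names `pearson` and `agent_is_suitable` are assumptions introduced for this sketch.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of expression levels."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("an expression profile has zero variance")
    return cov / (sd_x * sd_y)


def agent_is_suitable(test_profile, reference_profile,
                      reference_exposed_to_ligand, threshold=0.8):
    """Apply the rule from the text to two {sequence: level} profiles.

    threshold is a hypothetical cut-off for calling two profiles "similar".
    """
    shared = sorted(set(test_profile) & set(reference_profile))
    if len(shared) < 2:
        raise ValueError("need at least two shared sequences to compare")
    r = pearson([test_profile[g] for g in shared],
                [reference_profile[g] for g in shared])
    similar = r >= threshold
    # Reference exposed to a PPARy ligand: similarity indicates suitability.
    # Reference not exposed: dissimilarity indicates suitability.
    return similar if reference_exposed_to_ligand else not similar
```

In this sketch the same pair of profiles yields opposite conclusions depending on whether the reference cell population was exposed to a PPARy ligand, which mirrors the two cases set out above.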

In some embodiments, a decrease in expression of ARP or an increase in expression of 
one or more of ARP in a test cell population relative to a reference cell population is 
indicative that the agent is therapeutic. 

The test agent can be any compound or composition. In some embodiments the test 
agents are compounds and compositions known to be PPARy ligands, e.g., 
N-(2-benzoylphenyl)-L-tyrosine. 

Screening for Therapeutic Agents 

In one aspect, the invention provides a method of screening for therapeutic agents. By 
"therapeutic agent" is meant an agent that promotes a therapeutic effect, such as a 
chemotherapeutic compound. Preferably, the agent promotes insulin sensitivity. More 
preferably the agent inhibits ARP expression or activity. The therapeutic agent can be 
identified by providing a cell population that includes cells capable of expressing ARP. 

Expression of the nucleic acid sequences in the test cell population is then compared 
to the expression of the nucleic acid sequences in a reference cell population, which is a cell 
population that has not been exposed to the test agent, or, in some embodiments, a cell 
population exposed to the test agent. Comparison can be performed on test and reference 
samples measured concurrently or at temporally distinct times. An example of the latter is 
the use of compiled expression information, e.g., a sequence database, which assembles 
information about expression levels of known sequences following administration of various 
agents. For example, alteration of expression levels following administration of test agent 
can be compared to the expression changes observed in the nucleic acid sequences following 
administration of a control agent, parathyroid hormone.

An alteration in expression of the nucleic acid sequence in the test cell population 
compared to the expression of the nucleic acid sequence in the reference cell population that 
has not been exposed to the test agent indicates the test agent is a therapeutic agent. 

The invention also includes the therapeutic agent identified according to this 
screening method, and a pharmaceutical composition which includes the therapeutic agent. 

Methods of diagnosing or Determining the Susceptibility to clear cell renal 
carcinoma in a subject 

The invention further provides a method of diagnosing clear cell renal carcinoma in 
a subject. The disorder is diagnosed by examining the expression of ARP from a test population 
of cells from a subject suspected of having the disorder. 

Expression of ARP is measured in the test cell population and compared to the expression of the 
sequences in the reference cell population. The reference cell population contains at least one 
cell whose disease status (i.e., whether the reference cell population is from a subject suffering 
from clear cell renal carcinoma) is known. If the reference cell population contains cells from a 
subject not suffering from clear cell renal carcinoma, then a similarity in expression between ARP 
sequences in the test population and the reference cell population indicates the subject does 
not have clear cell renal carcinoma. A difference (e.g., an increase) in expression between ARP in 
the test population and the reference cell population indicates the subject has clear 
cell renal carcinoma.

Conversely, when the reference cell population contains cells that have clear cell renal 
carcinoma, a similarity in expression pattern between the test cell population and the 
reference cell population indicates the test cell population has clear cell renal carcinoma. A 
difference in expression between ARP sequences in the test population and the reference cell 
population indicates the subject does not have a clear cell renal carcinoma. 

Methods of treating Renal Disorders in a subject 

Also included in the invention is a method of treating, i.e., preventing or delaying the 
onset of, a renal disorder in a subject by administering to the subject an agent which 
modulates the expression or activity of ARP. "Modulates" is meant to include an increase or 
decrease in expression or activity of the ARP polypeptides or nucleic acids. Preferably, 
modulation results in alteration of the expression or activity of the ARP genes or gene 
products in a subject to a level similar or identical to that of a subject not suffering from the 
renal disorder. 

The renal disorder can be any of the pathophysiologies described herein, e.g., kidney 
cancer (i.e., renal cell carcinoma or Wilms tumor), polycystic kidney disease, renal dysplasia, 
or kidney degenerative disease (i.e., chronic kidney failure). 

In its various aspects and embodiments, the invention includes administering to a 
subject or contacting a cell with a compound that decreases ARP expression or activity. The 
compound can be, e.g., (i) an antibody or biologically active fragment thereof that 
specifically binds ARP; (ii) an anti-sense ARP nucleic acid; (iii) a ribozyme that specifically 
targets ARP; (iv) a nucleic acid that decreases the expression of a nucleic acid that encodes an 
ARP polypeptide, and derivatives, fragments, analogs and homologs thereof; and (v) small 
molecule ARP antagonists. 

The antibody can be, for example, monoclonal, polyclonal, humanized, radiolabeled, or 
bispecific. The nucleic acid can be either endogenous or exogenous. 

As used herein, the term "nucleic acid" is intended to include DNA molecules (e.g., 
cDNA or genomic DNA), RNA molecules (e.g., mRNA), analogs of the DNA or RNA 
generated using nucleotide analogs, and derivatives, fragments and homologs thereof. The 
nucleic acid molecule can be single-stranded or double-stranded. The nucleic acid can be 
either endogenous or exogenous. Preferably, the nucleic acid is a DNA. 

The compound can be administered to the subject either directly (i.e., the subject is 
directly exposed to the nucleic acid or nucleic acid-containing vector) or indirectly (i.e., cells 
are first transformed with the nucleic acid in vitro, then transplanted into the subject). For 
example, in one embodiment mammalian cells are isolated from a subject and the ARP 
anti-sense nucleic acid is introduced into the isolated cells in vitro. The cells are reintroduced into 
a suitable mammalian subject. Preferably, the cell is introduced into an autologous subject. 
The routes of administration of the compound can include, e.g., parenteral, intravenous, 
intradermal, subcutaneous, oral (e.g., inhalation), transdermal (topical), transmucosal, and 
rectal administration. In one embodiment the compound is administered intravenously. 

The subject can be, e.g., a human, a rodent such as a mouse or rat, or a dog or cat. 

ASSESSING EFFICACY OF TREATMENT OF A KIDNEY DISORDER IN A SUBJECT 

The differentially expressed ARP identified herein also allow for the course of 
treatment of a kidney disorder to be monitored. In this method, a test cell population is 
provided from a subject undergoing treatment for the kidney disorder. If desired, test cell 
populations can be taken from the subject at various time points before, during, or after 
treatment. Expression of ARP in the cell population is then measured and compared to a 
reference cell population which includes cells whose pathophysiologic state is known. 
Preferably, the reference cells have not been exposed to the treatment. 

If the reference cell population contains no cells having the pathophysiologic state, 
i.e., kidney disorder, a similarity in expression between ARP in the test cell population and 
the reference cell population indicates that the treatment is efficacious. However, a 
difference in expression between ARP in the test population and this reference cell 
population indicates the treatment is not efficacious. 

If the reference cell population contains no cells exposed to the treatment, a similarity 
in expression between ARP in the test cell population and the reference cell population 
indicates that the treatment is efficacious. However, a difference in expression between ARP 
in the test population and this reference cell population indicates the treatment is not 
efficacious. 

By "efficacious" is meant that the treatment leads to a decrease in the 
pathophysiology in a subject. When treatment is applied prophylactically, "efficacious" 
means that the treatment retards or prevents a pathophysiology. 

Efficaciousness can be determined in association with any known method for treating 
the particular pathophysiology. 

Methods of diagnosing or Determining the Susceptibility to an Inflammatory 
Disorder 

The invention further provides a method of diagnosing an inflammatory disorder in a 
subject. The disorder is diagnosed by examining the expression of ARP from a test population 
of cells from a subject suspected of having the disorder. An inflammatory disorder includes, for 
example, disorders of the pulmonary system, asthma, allergy, emphysema, arthritis (e.g., 
osteoarthritis), chronic obstructive pulmonary disease, or Crohn's disease. 

Expression of ARP is measured in the test cell population and compared to the expression of the 
sequences in the reference cell population. The reference cell population contains at least one 
cell whose disease status (i.e., whether the reference cell population is from a subject suffering 
from an inflammatory disorder) is known. If the reference cell population contains cells from a 
subject not suffering from an inflammatory disorder, then a similarity in expression between 
ARP sequences in the test population and the reference cell population indicates the subject 
does not have an inflammatory disorder. A difference (e.g., an increase) in expression between ARP in 
the test population and the reference cell population indicates the subject 
has an inflammatory disorder. 

Conversely, when the reference cell population contains cells that have an 
inflammatory disorder, a similarity in expression pattern between the test cell population and 
the reference cell population indicates the test cell population has an inflammatory disorder. 
A difference in expression between ARP sequences in the test population and the reference 
cell population indicates the subject does not have an inflammatory disorder. 

Methods of treating an inflammatory disorder in a subject 

Also included in the invention is a method of treating, i.e., preventing or delaying the 
onset of, an inflammatory disorder in a subject by administering to the subject an agent which 
modulates the expression or activity of ARP. "Modulates" is meant to include an increase or 
decrease in expression or activity of the ARP polypeptides or nucleic acids. Preferably, 
modulation results in alteration of the expression or activity of the ARP genes or gene 
products in a subject to a level similar or identical to that of a subject not suffering from the 
inflammatory disorder. 

The inflammatory disorder can be any of the pathophysiologies described herein, e.g., 
arthritis, COPD or emphysema. 

In its various aspects and embodiments, the invention includes administering to a 
subject or contacting a cell with a compound that decreases ARP expression or activity. The 
compound can be, e.g., (i) an antibody or biologically active fragment thereof that 
specifically binds ARP; (ii) an anti-sense ARP nucleic acid; (iii) a ribozyme that specifically 
targets ARP; (iv) a nucleic acid that decreases the expression of a nucleic acid that encodes an 
ARP polypeptide, and derivatives, fragments, analogs and homologs thereof; and (v) small 
molecule ARP antagonists. 

The antibody can be, for example, monoclonal, polyclonal, humanized, radiolabeled, or 
bispecific. The nucleic acid can be either endogenous or exogenous. 

As used herein, the term "nucleic acid" is intended to include DNA molecules (e.g., 
cDNA or genomic DNA), RNA molecules (e.g., mRNA), analogs of the DNA or RNA 
generated using nucleotide analogs, and derivatives, fragments and homologs thereof. The 
nucleic acid molecule can be single-stranded or double-stranded. The nucleic acid can be 
either endogenous or exogenous. Preferably, the nucleic acid is a DNA. 

The compound can be administered to the subject either directly (i.e., the subject is 
directly exposed to the nucleic acid or nucleic acid-containing vector) or indirectly (i.e., cells 
are first transformed with the nucleic acid in vitro, then transplanted into the subject). For 
example, in one embodiment mammalian cells are isolated from a subject and the ARP 
anti-sense nucleic acid is introduced into the isolated cells in vitro. The cells are reintroduced into 
a suitable mammalian subject. Preferably, the cell is introduced into an autologous subject. 
The routes of administration of the compound can include, e.g., parenteral, intravenous, 
intradermal, subcutaneous, oral (e.g., inhalation), transdermal (topical), transmucosal, and 
rectal administration. In one embodiment the compound is administered intravenously. 

The subject can be, e.g., a human, a rodent such as a mouse or rat, or a dog or cat. 

Assessing efficacy of treatment of an inflammatory disorder in a subject 

The differentially expressed ARP identified herein also allow for the course of 
treatment of an inflammatory disorder to be monitored. In this method, a test cell population 
is provided from a subject undergoing treatment for the inflammatory disorder. If desired, 
test cell populations can be taken from the subject at various time points before, during, or 
after treatment. Expression of ARP in the cell population is then measured and compared to 
a reference cell population which includes cells whose pathophysiologic state is known. 
Preferably, the reference cells have not been exposed to the treatment. 

If the reference cell population contains no cells having the pathophysiologic state, 
i.e., an inflammatory disorder, a similarity in expression between ARP in the test cell 
population and the reference cell population indicates that the treatment is efficacious. 
However, a difference in expression between ARP in the test population and this reference 
cell population indicates the treatment is not efficacious. 

If the reference cell population contains no cells exposed to the treatment, a similarity 
in expression between ARP in the test cell population and the reference cell population 
indicates that the treatment is efficacious. However, a difference in expression between ARP 
in the test population and this reference cell population indicates the treatment is not 
efficacious. 

By "efficacious" is meant that the treatment leads to a decrease in the 
pathophysiology in a subject. When treatment is applied prophylactically, "efficacious" 
means that the treatment retards or prevents a pathophysiology. 

Efficaciousness can be determined in association with any known method for treating 
the particular pathophysiology. 

Methods of treating or preventing disorders 

Also included in the invention are methods of treating, i.e., preventing or delaying the 
onset of, various disorders in a subject. Disorders amenable to treatment with the methods of 
the invention include, for example, inflammatory disorders (e.g., psoriasis, asthma, allergy, 
emphysema, stroke, ischemia reperfusion injury, encephalitis, meningitis, AIDS related 
dementia or septic shock), cancer (e.g., adenocarcinomas of the colon, squamous cell and 
adenocarcinomas of the lung, clear cell renal cell carcinomas, hepatocellular carcinomas, 
transitional cell carcinomas of the bladder, cystadenocarcinoma and adenocarcinomas of the 
stomach, ovarian tumors, thyroid tumors, gliomas and astrocytomas), CNS trauma (brain and 
spinal cord), peripheral neuropathies and demyelinating diseases (e.g., multiple sclerosis and 
cerebral lupus). In various aspects the method includes administering to the subject a 
compound which modulates IL-8 expression or activity. "Modulates" is meant to include an 
increase or decrease in IL-8 expression or activity. Preferably, modulation results in alteration of 
the expression or activity of IL-8 in a subject to a level similar or identical to that of a subject not 
suffering from the disorder. 

In its various aspects and embodiments, the invention includes administering to a 
subject or contacting a cell with a compound that decreases IL-8 expression or activity. The 
compound can be, e.g., (i) an antibody or biologically active fragment thereof that 
specifically binds IL-8; (ii) an anti-sense IL-8 nucleic acid; (iii) a ribozyme that specifically 
targets IL-8; (iv) a nucleic acid that decreases the expression of a nucleic acid that encodes an 
IL-8 polypeptide, and derivatives, fragments, analogs and homologs thereof; and (v) small 
molecule IL-8 antagonists. 

The antibody can be, for example, monoclonal, polyclonal, humanized, radiolabeled, or 
bispecific. The nucleic acid can be either endogenous or exogenous. 

As used herein, the term "nucleic acid" is intended to include DNA molecules (e.g., 
cDNA or genomic DNA), RNA molecules (e.g., mRNA), analogs of the DNA or RNA 
generated using nucleotide analogs, and derivatives, fragments and homologs thereof. The 
nucleic acid molecule can be single-stranded or double-stranded. The nucleic acid can be 
either endogenous or exogenous. Preferably, the nucleic acid is DNA. 

The compound can be administered to the subject either directly (i.e., the subject is 
directly exposed to the nucleic acid or nucleic acid-containing vector) or indirectly (i.e., cells 
are first transformed with the nucleic acid in vitro, then transplanted into the subject). For 
example, in one embodiment mammalian cells are isolated from a subject and an IL-8 anti- 
sense nucleic acid is introduced into the isolated cells in vitro. The cells are reintroduced into 
a suitable mammalian subject. Preferably, the cell is introduced into an autologous subject. 
In some embodiments, the cells may also be cultured ex vivo in the presence of therapeutic 
agents or proteins of the present invention in order to proliferate or to produce a desired effect 
on or activity in such cells. Treated cells can then be introduced in vivo for therapeutic 
purposes. 

The routes of administration of the compound can include e.g., parenteral, 
intravenous, intradermal, subcutaneous, oral (e.g., inhalation), transdermal (topical), 
transmucosal, and rectal administration. In one embodiment the compound is administered 
intravenously. 

The subject is preferably a mammal. The mammal can be, e.g., a human, non-human 
primate, mouse, rat, dog, cat, horse, or cow. 

The herein-described IL-8 modulating compounds, when used therapeutically, are 
referred to herein as "Therapeutics". Methods of administration of Therapeutics include, but 
are not limited to, intradermal, intramuscular, intraperitoneal, intravenous, subcutaneous, 
intranasal, epidural, and oral routes. The Therapeutics of the present invention may be 
administered by any convenient route, for example by infusion or bolus injection, by 
absorption through epithelial or mucocutaneous linings (e.g., oral mucosa, rectal and 
intestinal mucosa, etc.) and may be administered together with other biologically-active 
agents. Administration can be systemic or local. In addition, it may be advantageous to 
administer the Therapeutic into the central nervous system by any suitable route, including 
intraventricular and intrathecal injection. Intraventricular injection may be facilitated by an 
intraventricular catheter attached to a reservoir (e.g., an Ommaya reservoir). Pulmonary 
administration may also be employed by use of an inhaler or nebulizer, and formulation with 
an aerosolizing agent. It may also be desirable to administer the Therapeutic locally to the 
area in need of treatment; this may be achieved by, for example, and not by way of limitation, 
local infusion during surgery, topical application, by injection, by means of a catheter, by 
means of a suppository, or by means of an implant. Various delivery systems are known and 
can be used to administer a Therapeutic of the present invention including, e.g.: (i) 
encapsulation in liposomes, microparticles, microcapsules; (ii) recombinant cells capable of 
expressing the Therapeutic; (iii) receptor-mediated endocytosis (See, e.g., Wu and Wu, 1987. 
J Biol Chem 262:4429-4432); (iv) construction of a Therapeutic nucleic acid as part of a 
retroviral, adenoviral or other vector, and the like. In one embodiment of the present 
invention, the Therapeutic may be delivered in a vesicle, in particular a liposome. In a 
liposome, the protein of the present invention is combined, in addition to other 
pharmaceutically acceptable carriers, with amphipathic agents such as lipids which exist in 
aggregated form as micelles, insoluble monolayers, liquid crystals, or lamellar layers in 
aqueous solution. Suitable lipids for liposomal formulation include, without limitation, 
monoglycerides, diglycerides, sulfatides, lysolecithin, phospholipids, saponin, bile acids, and 
the like. Preparation of such liposomal formulations is within the level of skill in the art, as 
disclosed, for example, in U.S. Pat. No. 4,837,028; and U.S. Pat. No. 4,737,323, both of 
which are incorporated herein by reference. In yet another embodiment, the Therapeutic can 
be delivered in a controlled release system including, e.g., a delivery pump (See, e.g., Saudek, 
et al., 1989. New Engl J Med 321:574) and a semi-permeable polymeric material (See, e.g., 
Howard, et al., 1989. J Neurosurg 71:105). Additionally, the controlled release system can 
be placed in proximity of the therapeutic target (e.g., the brain), thus requiring only a fraction 
of the systemic dose. See, e.g., Goodson, In: Medical Applications of Controlled Release 
1984. (CRC Press, Boca Raton, FL). 

In a specific embodiment of the present invention, where the Therapeutic is a nucleic 
acid encoding a protein, the Therapeutic nucleic acid may be administered in vivo to promote 
expression of its encoded protein, by constructing it as part of an appropriate nucleic acid 
expression vector and administering it so that it becomes intracellular, e.g., by use of a 
retroviral vector, by direct injection, by use of microparticle bombardment, by coating with 
lipids or cell-surface receptors or transfecting agents, or by administering it in linkage to a 
homeobox-like peptide which is known to enter the nucleus. See, e.g., Joliot, et al., 1991. 
Proc Natl Acad Sci USA 88:1864-1868. Alternatively, a nucleic acid Therapeutic can be 
introduced intracellularly and incorporated within host cell DNA for expression, by 
homologous recombination or it can remain episomal. 

As used herein, the term "therapeutically effective amount" means the total amount of 
each active component of the pharmaceutical composition or method that is sufficient to 
show a meaningful patient benefit, i.e., treatment, healing, prevention or amelioration of the 
relevant medical condition, or an increase in rate of treatment, healing, prevention or 
amelioration of such conditions. When applied to an individual active ingredient, 
administered alone, the term refers to that ingredient alone. When applied to a combination, 
the term refers to combined amounts of the active ingredients that result in the therapeutic 
effect, whether administered in combination, serially or simultaneously. 

The amount of the Therapeutic of the invention which will be effective in the 
treatment of a particular disorder or condition will depend on the nature of the disorder or 
condition, and may be determined by standard clinical techniques by those of average skill 
within the art. In addition, in vitro assays may optionally be employed to help identify 
optimal dosage ranges. The precise dose to be employed in the formulation will also depend 
on the route of administration, and the overall seriousness of the disease or disorder, and 
should be decided according to the judgment of the practitioner and each patient's 
circumstances. Ultimately, the attending physician will decide the amount of Therapeutic of 
the present invention with which to treat each individual patient. Initially, the attending 
physician will administer low doses of Therapeutic of the present invention and observe the 
patient's response. Larger doses of Therapeutic of the present invention may be administered 
until the optimal therapeutic effect is obtained for the patient, and at that point the dosage is 
not increased further. However, suitable dosage ranges for intravenous administration of the 
Therapeutics of the present invention are generally about 20-500 micrograms (µg) of active 
compound per kilogram (kg) body weight. Suitable dosage ranges for intranasal 
administration are generally about 0.01 pg/kg body weight to 1 mg/kg body weight. 
Effective doses may be extrapolated from dose-response curves derived from in vitro or 
animal model test systems. Suppositories generally contain active ingredient in the range of 
0.5% to 10% by weight; oral formulations preferably contain 10% to 95% active ingredient. 

The duration of intravenous therapy using the Therapeutic of the present invention 
will vary, depending on the severity of the disease being treated and the condition and 
potential idiosyncratic response of each individual patient. It is contemplated that the 
duration of each application of the protein of the present invention will be in the range of 12 
to 24 hours of continuous intravenous administration. Ultimately the attending physician will 
decide on the appropriate duration of intravenous therapy using the pharmaceutical 
composition of the present invention. 

The invention will be further described in the following examples, which do not limit 
the scope of the invention described in the claims. 

EXAMPLES 

Example 1: Identification of NOVX Nucleic Acids 

TblastN using CuraGen Corporation's sequence file for polypeptides or homologs 
was run against the Genomic Daily Files made available by GenBank or from files 
downloaded from the individual sequencing centers. Exons were predicted by homology and 
the intron/exon boundaries were determined using standard genetic rules. Exons were further 
selected and refined by means of similarity determination using multiple BLAST (for 
example, tBlastN, BlastX, and BlastN) searches, and, in some instances, GeneScan and Grail. 
Expressed sequences from both public and proprietary databases were also added when 
available to further define and complete the gene sequence. The DNA sequence was then 
manually corrected for apparent inconsistencies thereby obtaining the sequences encoding the 
full-length protein. 

The novel NOVX target sequences identified in the present invention were subjected 
to the exon linking process to confirm the sequence. PCR primers were designed by starting 
at the most upstream sequence available, for the forward primer, and at the most downstream 
sequence available for the reverse primer. PCR primer sequences were used for obtaining 
different clones. In each case, the sequence was examined, walking inward from the 
respective termini toward the coding sequence, until a suitable sequence that is either unique 
or highly selective was encountered, or, in the case of the reverse primer, until the stop codon 
was reached. Such primers were designed based on in silico predictions for the full length 
cDNA, part (one or more exons) of the DNA or protein sequence of the target sequence, or 
by translated homology of the predicted exons to closely related human sequences from other 
species. These primers were then employed in PCR amplification based on the following pool 
of human cDNAs: adrenal gland, bone marrow, brain - amygdala, brain - cerebellum, brain - 
hippocampus, brain - substantia nigra, brain - thalamus, brain -whole, fetal brain, fetal 
kidney, fetal liver, fetal lung, heart, kidney, lymphoma - Raji, mammary gland, pancreas, 
pituitary gland, placenta, prostate, salivary gland, skeletal muscle, small intestine, spinal cord, 
spleen, stomach, testis, thyroid, trachea, uterus. Usually the resulting amplicons were gel 
purified, cloned and sequenced to high redundancy. The PCR product derived from exon 
linking was cloned into the pCR2.1 vector from Invitrogen. The resulting bacterial clone has 
an insert covering the entire open reading frame cloned into the pCR2.1 vector. The resulting 
sequences from all clones were assembled with themselves, with other fragments in CuraGen 
Corporation's database and with public ESTs. Fragments and ESTs were included as 
components for an assembly when the extent of their identity with another component of the 
assembly was at least 95% over 50 bp. In addition, sequence traces were evaluated manually 
and edited for corrections if appropriate. These procedures provide the sequence reported 
herein. 
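
For illustration, the inclusion criterion stated above (at least 95% identity over 50 bp) can be sketched in Python. The sketch assumes the two components have already been aligned to equal length, with `-` marking gaps, and it reads "over 50 bp" as any 50-position window of that alignment; both the reading and the function name `meets_inclusion_criterion` are assumptions of this sketch, not details given in the procedure.

```python
def meets_inclusion_criterion(aligned_a, aligned_b, min_identity=0.95, window=50):
    """Return True if some `window`-long stretch of the alignment reaches
    `min_identity`. Inputs are equal-length aligned strings; '-' is a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    n = len(aligned_a)
    if n < window:
        return False
    # 1 where both sequences carry the same base, 0 for mismatches and gaps.
    matches = [1 if a == b and a != "-" else 0
               for a, b in zip(aligned_a.upper(), aligned_b.upper())]
    count = sum(matches[:window])
    if count / window >= min_identity:
        return True
    # Slide the window one position at a time, updating the match count.
    for i in range(window, n):
        count += matches[i] - matches[i - window]
        if count / window >= min_identity:
            return True
    return False
```

With the default parameters, a 50-position window passes when it contains at most two mismatched or gapped positions.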

Physical clone: Exons were predicted by homology and the intron/exon boundaries 
were determined using standard genetic rules. Exons were further selected and refined by 
means of similarity determination using multiple BLAST (for example, tBlastN, BlastX, and 
BlastN) searches, and, in some instances, GeneScan and Grail. Expressed sequences from 
both public and proprietary databases were also added when available to further define and 
complete the gene sequence. The DNA sequence was then manually corrected for apparent 
inconsistencies thereby obtaining the sequences encoding the full-length protein. 

Example 2: Identification of Single Nucleotide Polymorphisms in NOVX Nucleic Acid Sequences 

Variant sequences are also included in this application. A variant sequence can 
include a single nucleotide polymorphism (SNP). A SNP can, in some instances, be referred 
to as a "cSNP" to denote that the nucleotide sequence containing the SNP originates as a 
cDNA. A SNP can arise in several ways. For example, a SNP may be due to a substitution of 
one nucleotide for another at the polymorphic site. Such a substitution can be either a 
transition or a transversion. A SNP can also arise from a deletion of a nucleotide or an 
insertion of a nucleotide, relative to a reference allele. In this case, the polymorphic site is a 
site at which one allele bears a gap with respect to a particular nucleotide in another allele. 
SNPs occurring within genes may result in an alteration of the amino acid encoded by the 
gene at the position of the SNP. Intragenic SNPs may also be silent, when a codon including 
a SNP encodes the same amino acid as a result of the redundancy of the genetic code. SNPs 
occurring outside the region of a gene, or in an intron within a gene, do not result in changes 
in any amino acid sequence of a protein but may result in altered regulation of the expression 
pattern. Examples include alteration in temporal expression, physiological response 
regulation, cell type expression regulation, intensity of expression, and stability of transcribed 
message. 
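
The distinctions drawn above (transition versus transversion, and silent versus amino-acid-altering substitutions) can be sketched as follows. This is an illustrative example only: the function names are hypothetical, the standard genetic code is assumed, and only single-base substitutions within one codon are handled (insertions and deletions are not).

```python
from itertools import product

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def substitution_class(ref, alt):
    """Classify a single-base substitution as a transition or a transversion."""
    ref, alt = ref.upper(), alt.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid or ref == alt:
        raise ValueError("ref and alt must be two different bases from A, C, G, T")
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


def coding_effect(ref_codon, position, alt_base):
    """Report whether a substitution at a 0-based codon position is silent."""
    ref_codon = ref_codon.upper()
    alt_codon = ref_codon[:position] + alt_base.upper() + ref_codon[position + 1:]
    ref_aa, alt_aa = CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]
    return "silent" if ref_aa == alt_aa else f"{ref_aa}->{alt_aa}"
```

For instance, GAA to GAG is a transition that leaves glutamate unchanged (silent), while GAG to GTG is a transversion that replaces glutamate with valine.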

SeqCalling assemblies produced by the exon linking process were selected and 
extended using the following criteria. Genomic clones having regions with 98% identity to 
all or part of the initial or extended sequence were identified by BLASTN searches using the 
relevant sequence to query human genomic databases. The genomic clones that resulted were 
selected for further analysis because this identity indicates that these clones contain the 
genomic locus for these SeqCalling assemblies. These sequences were analyzed for putative 
coding regions as well as for similarity to the known DNA and protein sequences. Programs 
used for these analyses include Grail, Genscan, BLAST, HMMER, FASTA, Hybrid and other 
relevant programs. 

Some additional genomic regions may have also been identified because selected 
SeqCalling assemblies map to those regions. Such SeqCalling sequences may have 
overlapped with regions defined by homology or exon prediction. They may also be included
because the location of the fragment was in the vicinity of genomic regions identified by 
similarity or exon prediction that had been included in the original predicted sequence. The 
sequence so identified was manually assembled and then may have been extended using one 
or more additional sequences taken from CuraGen Corporation's human SeqCalling database. 
SeqCalling fragments suitable for inclusion were identified by the CuraTools™ program 
SeqExtend or by identifying SeqCalling fragments mapping to the appropriate regions of the 
genomic clones analyzed. 

The regions defined by the procedures described above were then manually integrated 
and corrected for apparent inconsistencies that may have arisen, for example, from miscalled
bases in the original fragments or from discrepancies between predicted exon junctions, EST
locations and regions of sequence similarity, to derive the final sequence disclosed herein.
When necessary, the process to identify and analyze SeqCalling assemblies and genomic
clones was reiterated to derive the full length sequence (Alderborn et al., Determination of
Single Nucleotide Polymorphisms by Real-time Pyrophosphate DNA Sequencing. Genome
Research. 10 (8) 1249-1265, 2000).

Example 3. Quantitative expression analysis of clones in various cells and tissues

The quantitative expression of various clones was assessed using microtiter plates
containing RNA samples from a variety of normal and pathology-derived cells, cell lines and
tissues using real time quantitative PCR (RTQ PCR). RTQ PCR was performed on an
Applied Biosystems ABI PRISM® 7700 or an ABI PRISM® 7900 HT Sequence Detection
System. Various collections of samples are assembled on the plates, and referred to as Panel
1 (containing normal tissues and cancer cell lines), Panel 2 (containing samples derived from
tissues from normal and cancer sources), Panel 3 (containing cancer cell lines), Panel 4
(containing cells and cell lines from normal tissues and cells related to inflammatory
conditions), Panel 5D/5I (containing human tissues and cell lines with an emphasis on
metabolic diseases), AI_comprehensive_panel (containing normal tissue and samples from
autoimmune diseases), Panel CNSD.01 (containing central nervous system samples from
normal and diseased brains) and CNS_neurodegeneration_panel (containing samples from
normal and Alzheimer's diseased brains).

RNA integrity from all samples is controlled for quality by visual assessment of 
agarose gel electropherograms using 28S and 18S ribosomal RNA staining intensity ratio as a 
guide (2:1 to 2.5:1 28S:18S) and the absence of low molecular weight RNAs that would be
indicative of degradation products. Samples are controlled against genomic DNA
contamination by RTQ PCR reactions run in the absence of reverse transcriptase using probe
and primer sets designed to amplify across the span of a single exon. 

First, the RNA samples were normalized to reference nucleic acids such as
constitutively expressed genes (for example, β-actin and GAPDH). Normalized RNA (5 µl)
was converted to cDNA and analyzed by RTQ-PCR using One Step RT-PCR Master Mix
Reagents (Applied Biosystems; Catalog No. 4309169) and gene-specific primers according to
the manufacturer's instructions.

In other cases, non-normalized RNA samples were converted to single strand cDNA
(sscDNA) using Superscript II (Invitrogen Corporation; Catalog No. 18064-147) and random
hexamers according to the manufacturer's instructions. Reactions containing up to 10 ng of
total RNA were performed in a volume of 20 µl and incubated for 60 minutes at 42°C. This
reaction can be scaled up to 50 µg of total RNA in a final volume of 100 µl. sscDNA samples
are then normalized to reference nucleic acids as described previously, using 1X TaqMan®
Universal Master mix (Applied Biosystems; catalog No. 4324020), following the
manufacturer's instructions.

Probes and primers were designed for each assay according to Applied Biosystems 
Primer Express Software package (version I for Apple Computer's Macintosh Power PC) or a 
similar algorithm using the target sequence as input. Default settings were used for reaction 
conditions and the following parameters were set before selecting primers: primer 
concentration = 250 nM, primer melting temperature (Tm) range = 58°-60°C, primer optimal
Tm = 59°C, maximum primer difference = 2°C, probe does not have 5'G, probe Tm must be
10°C greater than primer Tm, amplicon size 75 bp to 100 bp. The probes and primers selected
(see below) were synthesized by Synthegen (Houston, TX, USA). Probes were double 
purified by HPLC to remove uncoupled dye and evaluated by mass spectroscopy to verify 
coupling of reporter and quencher dyes to the 5' and 3' ends of the probe, respectively. Their 
final concentrations were: forward and reverse primers, 900nM each, and probe, 200nM. 
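
The selection parameters listed above can be expressed as a simple design check. The sketch below is illustrative only: it does not compute melting temperatures (these are assumed to be supplied by the design software), the class and function names are hypothetical, and the probe rule is interpreted as "at least 10°C above the higher primer Tm", which is an assumption.

```python
from dataclasses import dataclass


@dataclass
class AssayDesign:
    forward_tm: float   # forward primer Tm in degrees C (supplied, not computed)
    reverse_tm: float   # reverse primer Tm in degrees C (supplied, not computed)
    probe_tm: float     # probe Tm in degrees C (supplied, not computed)
    probe_seq: str      # probe sequence written 5' to 3'
    amplicon_bp: int    # amplicon length in base pairs


def design_violations(design, tm_range=(58.0, 60.0), max_primer_tm_diff=2.0,
                      min_probe_tm_offset=10.0, amplicon_range=(75, 100)):
    """Return a list of human-readable rule violations (empty if none)."""
    problems = []
    for name, tm in (("forward", design.forward_tm), ("reverse", design.reverse_tm)):
        if not tm_range[0] <= tm <= tm_range[1]:
            problems.append(f"{name} primer Tm {tm} outside {tm_range}")
    if abs(design.forward_tm - design.reverse_tm) > max_primer_tm_diff:
        problems.append("primer Tm difference exceeds limit")
    if design.probe_seq.upper().startswith("G"):
        problems.append("probe has a 5' G")
    if design.probe_tm - max(design.forward_tm, design.reverse_tm) < min_probe_tm_offset:
        problems.append("probe Tm is not sufficiently above primer Tm")
    if not amplicon_range[0] <= design.amplicon_bp <= amplicon_range[1]:
        problems.append(f"amplicon size {design.amplicon_bp} bp outside {amplicon_range}")
    return problems
```

A design with primer Tm values of 59.0°C and 58.5°C, a probe Tm of 69.5°C, a probe that does not begin with G and an 80 bp amplicon would return an empty list.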

PCR conditions: When working with RNA samples, normalized RNA from each 
tissue and each cell line was spotted in each well of either a 96 well or a 384-well PCR plate 
(Applied Biosystems). PCR cocktails included either a single gene specific probe and primers 
set, or two multiplexed probe and primers sets (a set specific for the target clone and another 
gene-specific set multiplexed with the target probe). PCR reactions were set up using 
TaqMan® One-Step RT-PCR Master Mix (Applied Biosystems, Catalog No. 4313803) 
following manufacturer's instructions. Reverse transcription was performed at 48°C for 30
minutes followed by amplification/PCR cycles as follows: 95°C 10 min, then 40 cycles of
95°C for 15 seconds, 60°C for 1 minute. Results were recorded as CT values (cycle at which a
given sample crosses a threshold level of fluorescence) using a log scale, with the difference
in RNA concentration between a given sample and the sample with the lowest CT value
being represented as 2 to the power of delta CT. The percent relative expression is then
obtained by taking the reciprocal of this RNA difference and multiplying by 100.
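
The calculation just described can be written out as a short sketch. The function name and the dictionary input are illustrative assumptions; the arithmetic follows the text: delta CT is taken relative to the sample with the lowest CT, the RNA difference is 2 to the power of delta CT, and the percent relative expression is the reciprocal of that difference multiplied by 100.

```python
def percent_relative_expression(ct_values):
    """Map each sample name to its percent relative expression.

    ct_values: dict of sample name -> CT value (cycle at which the sample
    crosses the fluorescence threshold).
    """
    if not ct_values:
        raise ValueError("at least one CT value is required")
    lowest_ct = min(ct_values.values())
    result = {}
    for sample, ct in ct_values.items():
        delta_ct = ct - lowest_ct          # 0 for the highest-expressing sample
        rna_difference = 2.0 ** delta_ct   # fold difference in RNA concentration
        result[sample] = 100.0 / rna_difference
    return result
```

Under this scheme the sample with the lowest CT is always assigned 100%, and each additional cycle halves the reported value (for example, one cycle later gives 50% and three cycles later gives 12.5%).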

When working with sscDNA samples, normalized sscDNA was used as described 
previously for RNA samples. PCR reactions containing one or two sets of probe and primers
were set up as described previously, using 1X TaqMan® Universal Master mix (Applied
Biosystems; catalog No. 4324020), following the manufacturer's instructions. PCR
amplification was performed as follows: 95°C 10 min, then 40 cycles of 95°C for 15 seconds,
60°C for 1 minute. Results were analyzed and processed as described previously.

Panels 1, 1.1, 1.2, and 1.3D

The plates for Panels 1, 1.1, 1.2 and 1.3D include 2 control wells (genomic DNA
control and chemistry control) and 94 wells containing cDNA from various samples. The 
samples in these panels are broken into 2 classes: samples derived from cultured cell lines 
and samples derived from primary normal tissues. The cell lines are derived from cancers of 
the following types: lung cancer, breast cancer, melanoma, colon cancer, prostate cancer, 
CNS cancer, squamous cell carcinoma, ovarian cancer, liver cancer, renal cancer, gastric 
cancer and pancreatic cancer. Cell lines used in these panels are widely available through the 
American Type Culture Collection (ATCC), a repository for cultured cell lines, and were 
cultured using the conditions recommended by the ATCC. The normal tissues found on these 
panels are comprised of samples derived from all major organ systems from single adult 
individuals or fetuses. These samples are derived from the following organs: adult skeletal 
muscle, fetal skeletal muscle, adult heart, fetal heart, adult kidney, fetal kidney, adult liver, 
fetal liver, adult lung, fetal lung, various regions of the brain, the spleen, bone marrow, lymph 
node, pancreas, salivary gland, pituitary gland, adrenal gland, spinal cord, thymus, stomach, 
small intestine, colon, bladder, trachea, breast, ovary, uterus, placenta, prostate, testis and 
adipose. 

In the results for Panels 1, 1.1, 1.2 and 1.3D, the following abbreviations are used: 

ca. = carcinoma, 
* = established from metastasis, 
met = metastasis, 
s cell var = small cell variant, 
non-s = non-sm = non-small, 
squam = squamous, 
pl. eff = pl effusion = pleural effusion,
glio = glioma, 
astro = astrocytoma, and 
neuro = neuroblastoma.

General_screening_panel_v1.4

The plates for Panel 1.4 include 2 control wells (genomic DNA control and chemistry
control) and 94 wells containing cDNA from various samples. The samples in Panel 1.4 are
broken into 2 classes: samples derived from cultured cell lines and samples derived from 
primary normal tissues. The cell lines are derived from cancers of the following types: lung 
cancer, breast cancer, melanoma, colon cancer, prostate cancer, CNS cancer, squamous cell 
carcinoma, ovarian cancer, liver cancer, renal cancer, gastric cancer and pancreatic cancer. 
Cell lines used in Panel 1.4 are widely available through the American Type Culture 
Collection (ATCC), a repository for cultured cell lines, and were cultured using the 
conditions recommended by the ATCC. The normal tissues found on Panel 1.4 are comprised
of pools of samples derived from all major organ systems from 2 to 5 different adult 
individuals or fetuses. These samples are derived from the following organs: adult skeletal 
muscle, fetal skeletal muscle, adult heart, fetal heart, adult kidney, fetal kidney, adult liver, 
fetal liver, adult lung, fetal lung, various regions of the brain, the spleen, bone marrow, lymph 
node, pancreas, salivary gland, pituitary gland, adrenal gland, spinal cord, thymus, stomach, 
small intestine, colon, bladder, trachea, breast, ovary, uterus, placenta, prostate, testis and 
adipose. Abbreviations are as described for Panels 1, 1.1, 1.2, and 1.3D. 

Panels 2D and 2.2 

The plates for Panels 2D and 2.2 generally include 2 control wells and 94 test samples 
composed of RNA or cDNA isolated from human tissue procured by surgeons working in 
close cooperation with the National Cancer Institute's Cooperative Human Tissue Network 
(CHTN) or the National Disease Research Interchange (NDRI). The tissues are derived from
human malignancies and in cases where indicated many malignant tissues have "matched
margins" obtained from noncancerous tissue just adjacent to the tumor. These are termed
normal adjacent tissues and are denoted "NAT" in the results below. The tumor tissue and the 
"matched margins" are evaluated by two independent pathologists (the surgical pathologists 
and again by a pathologist at NDRI or CHTN). This analysis provides a gross 
histopathological assessment of tumor differentiation grade. Moreover, most samples include 
the original surgical pathology report that provides information regarding the clinical stage of 
the patient. These matched margins are taken from the tissue surrounding (i.e. immediately 
proximal) to the zone of surgery (designated "NAT", for normal adjacent tissue, in Table 
RR). In addition, RNA and cDNA samples were obtained from various human tissues derived
from autopsies performed on elderly people or sudden death victims (accidents, etc.). These
tissues were ascertained to be free of disease and were purchased from various commercial 
sources such as Clontech (Palo Alto, CA), Research Genetics, and Invitrogen. 

Panel 3D 

The plates of Panel 3D are comprised of 94 cDNA samples and two control samples. 
Specifically, 92 of these samples are derived from cultured human cancer cell lines, 2 
samples of human primary cerebellar tissue and 2 controls. The human cell lines are 
generally obtained from ATCC (American Type Culture Collection), NCI or the German 
tumor cell bank and fall into the following tissue groups: Squamous cell carcinoma of the 
tongue, breast cancer, prostate cancer, melanoma, epidermoid carcinoma, sarcomas, bladder 
carcinomas, pancreatic cancers, kidney cancers, leukemias/lymphomas, 
ovarian/uterine/cervical, gastric, colon, lung and CNS cancer cell lines. In addition, there are 
two independent samples of cerebellum. These cells are all cultured under standard 
recommended conditions and RNA extracted using the standard procedures. The cell lines in 
panel 3D and 1.3D are of the most common cell lines used in the scientific literature.

Panels 4D, 4R, and 4.1D 

Panel 4 includes samples on a 96 well plate (2 control wells, 94 test samples) 
composed of RNA (Panel 4R) or cDNA (Panels 4D/4.1D) isolated from various human cell 
lines or tissues related to inflammatory conditions. Total RNA from control normal tissues 
such as colon and lung (Stratagene, La Jolla, CA) and thymus and kidney (Clontech) was 
employed. Total RNA from liver tissue from cirrhosis patients and kidney from lupus patients 
was obtained from BioChain (Biochain Institute, Inc., Hayward, CA). Intestinal tissue for 
RNA preparation from patients diagnosed as having Crohn's disease and ulcerative colitis 
was obtained from the National Disease Research Interchange (NDRI) (Philadelphia, PA).

Astrocytes, lung fibroblasts, dermal fibroblasts, coronary artery smooth muscle cells, 
small airway epithelium, bronchial epithelium, microvascular dermal endothelial cells, 
microvascular lung endothelial cells, human pulmonary aortic endothelial cells, human 
umbilical vein endothelial cells were all purchased from Clonetics (Walkersville, MD) and 
grown in the media supplied for these cell types by Clonetics. These primary cell types were 
activated with various cytokines or combinations of cytokines for 6 and/or 12-14 hours, as 
indicated. The following cytokines were used: IL-1 beta at approximately 1-5 ng/ml, TNF
alpha at approximately 5-10 ng/ml, IFN gamma at approximately 20-50 ng/ml, IL-4 at
approximately 5-10 ng/ml, IL-9 at approximately 5-10 ng/ml, IL-13 at approximately
5-10 ng/ml. Endothelial cells were sometimes starved for various times by culture in the basal
media from Clonetics with 0.1% serum.

Mononuclear cells were prepared from blood of employees at CuraGen Corporation,
using Ficoll. LAK cells were prepared from these cells by culture in DMEM 5% FCS
(Hyclone), 100 µM non essential amino acids (Gibco/Life Technologies, Rockville, MD),
1 mM sodium pyruvate (Gibco), mercaptoethanol 5.5x10^-5 M (Gibco), and 10 mM Hepes
(Gibco) and Interleukin 2 for 4-6 days. Cells were then either activated with 10-20 ng/ml
PMA and 1-2 µg/ml ionomycin, IL-12 at 5-10 ng/ml, IFN gamma at 20-50 ng/ml and IL-18 at
5-10 ng/ml for 6 hours. In some cases, mononuclear cells were cultured for 4-5 days in
DMEM 5% FCS (Hyclone), 100 µM non essential amino acids (Gibco), 1 mM sodium
pyruvate (Gibco), mercaptoethanol 5.5x10^-5 M (Gibco), and 10 mM Hepes (Gibco) with PHA
(phytohemagglutinin) or PWM (pokeweed mitogen) at approximately 5 µg/ml. Samples were
taken at 24, 48 and 72 hours for RNA preparation. MLR (mixed lymphocyte reaction)
samples were obtained by taking blood from two donors, isolating the mononuclear cells
using Ficoll and mixing the isolated mononuclear cells 1:1 at a final concentration of
approximately 2x10^ cells/ml in DMEM 5% FCS (Hyclone), 100 µM non essential amino
acids (Gibco), 1 mM sodium pyruvate (Gibco), mercaptoethanol (5.5x10^-5 M) (Gibco), and
10 mM Hepes (Gibco). The MLR was cultured and samples taken at various time points
ranging from 1-7 days for RNA preparation.

Monocytes were isolated from mononuclear cells using CD14 Miltenyi Beads, positive
VS selection columns and a Vario Magnet according to the manufacturer's instructions.
Monocytes were differentiated into dendritic cells by culture in DMEM 5% fetal calf serum
(FCS) (Hyclone, Logan, UT), 100 µM non essential amino acids (Gibco), 1 mM sodium
pyruvate (Gibco), mercaptoethanol 5.5x10^-5 M (Gibco), and 10 mM Hepes (Gibco), 50 ng/ml
GMCSF and 5 ng/ml IL-4 for 5-7 days. Macrophages were prepared by culture of monocytes
for 5-7 days in DMEM 5% FCS (Hyclone), 100 µM non essential amino acids (Gibco), 1 mM
sodium pyruvate (Gibco), mercaptoethanol 5.5x10^-5 M (Gibco), 10 mM Hepes (Gibco) and
10% AB Human Serum or MCSF at approximately 50 ng/ml. Monocytes, macrophages and
dendritic cells were stimulated for 6 and 12-14 hours with lipopolysaccharide (LPS) at
100 ng/ml. Dendritic cells were also stimulated with anti-CD40 monoclonal antibody
(Pharmingen) at 10 µg/ml for 6 and 12-14 hours.

CD4 lymphocytes, CD8 lymphocytes and NK cells were also isolated from
mononuclear cells using CD4, CD8 and CD56 Miltenyi beads, positive VS selection columns
and a Vario Magnet according to the manufacturer's instructions. CD45RA and CD45RO
CD4 lymphocytes were isolated by depleting mononuclear cells of CD8, CD56, CD14 and
CD19 cells using CD8, CD56, CD14 and CD19 Miltenyi beads and positive selection.
CD45RO beads were then used to isolate the CD45RO CD4 lymphocytes with the remaining
cells being CD45RA CD4 lymphocytes. CD45RA CD4, CD45RO CD4 and CD8
lymphocytes were placed in DMEM 5% FCS (Hyclone), 100 µM non essential amino acids
(Gibco), 1 mM sodium pyruvate (Gibco), mercaptoethanol 5.5x10^-5 M (Gibco), and 10 mM
Hepes (Gibco) and plated at 10^ cells/ml onto Falcon 6 well tissue culture plates that had been
coated overnight with 0.5 µg/ml anti-CD28 (Pharmingen) and 3 µg/ml anti-CD3 (OKT3,
ATCC) in PBS. After 6 and 24 hours, the cells were harvested for RNA preparation. To
prepare chronically activated CD8 lymphocytes, we activated the isolated CD8 lymphocytes
for 4 days on anti-CD28 and anti-CD3 coated plates and then harvested the cells and
expanded them in DMEM 5% FCS (Hyclone), 100 µM non essential amino acids (Gibco),
1 mM sodium pyruvate (Gibco), mercaptoethanol 5.5x10^-5 M (Gibco), and 10 mM Hepes
(Gibco) and IL-2. The expanded CD8 cells were then activated again with plate bound
anti-CD3 and anti-CD28 for 4 days and expanded as before. RNA was isolated 6 and 24 hours
after the second activation and after 4 days of the second expansion culture. The isolated NK
cells were cultured in DMEM 5% FCS (Hyclone), 100 µM non essential amino acids (Gibco),
1 mM sodium pyruvate (Gibco), mercaptoethanol 5.5x10^-5 M (Gibco), and 10 mM Hepes
(Gibco) and IL-2 for 4-6 days before RNA was prepared.

To obtain B cells, tonsils were procured from NDRI. The tonsil was cut up with
sterile dissecting scissors and then passed through a sieve. Tonsil cells were then spun down
and resuspended at 10^ cells/ml in DMEM 5% FCS (Hyclone), 100 µM non essential amino
acids (Gibco), 1 mM sodium pyruvate (Gibco), mercaptoethanol 5.5x10^-5 M (Gibco), and
10 mM Hepes (Gibco). To activate the cells, we used PWM at 5 µg/ml or anti-CD40
(Pharmingen) at approximately 10 µg/ml and IL-4 at 5-10 ng/ml. Cells were harvested for
RNA preparation at 24, 48 and 72 hours.

To prepare the primary and secondary Th1/Th2 and Tr1 cells, six-well Falcon plates
were coated overnight with 10 µg/ml anti-CD28 (Pharmingen) and 2 µg/ml OKT3 (ATCC),
and then washed twice with PBS. Umbilical cord blood CD4 lymphocytes (Poietic Systems,
German Town, MD) were cultured at 10^-10^ cells/ml in DMEM 5% FCS (Hyclone), 100 µM
non essential amino acids (Gibco), 1 mM sodium pyruvate (Gibco), mercaptoethanol
5.5x10^-5 M (Gibco), 10 mM Hepes (Gibco) and IL-2 (4 ng/ml). IL-12 (5 ng/ml) and anti-IL4
(1 µg/ml) were used to direct to Th1, while IL-4 (5 ng/ml) and anti-IFN gamma (1 µg/ml) were
used to direct to Th2 and IL-10 at 5 ng/ml was used to direct to Tr1. After 4-5 days, the
activated Th1, Th2 and Tr1 lymphocytes were washed once in DMEM and expanded for 4-7
days in DMEM 5% FCS (Hyclone), 100 µM non essential amino acids (Gibco), 1 mM sodium
pyruvate (Gibco), mercaptoethanol 5.5x10^-5 M (Gibco), 10 mM Hepes (Gibco) and IL-2 (1 ng/ml).
Following this, the activated Th1, Th2 and Tr1 lymphocytes were re-stimulated for 5 days
with anti-CD28/OKT3 and cytokines as described above, but with the addition of
anti-CD95L (1 ng/ml) to prevent apoptosis. After 4-5 days, the Th1, Th2 and Tr1 lymphocytes
were washed and then expanded again with IL-2 for 4-7 days. Activated Th1 and Th2
lymphocytes were maintained in this way for a maximum of three cycles. RNA was prepared
from primary and secondary Th1, Th2 and Tr1 after 6 and 24 hours following the second and
third activations with plate bound anti-CD3 and anti-CD28 mAbs and 4 days into the second
and third expansion cultures in Interleukin 2.

The following leukocyte cell lines were obtained from the ATCC: Ramos, EOL-1,
KU-812. EOL cells were further differentiated by culture in 0.1 mM dbcAMP at
5x10^ cells/ml for 8 days, changing the media every 3 days and adjusting the cell
concentration to 5x10^ cells/ml. For the culture of these cells, we used DMEM or RPMI (as
recommended by the ATCC), with the addition of 5% FCS (Hyclone), 100 µM non essential
amino acids (Gibco), 1 mM sodium pyruvate (Gibco), mercaptoethanol 5.5x10^-5 M (Gibco),
10 mM Hepes (Gibco). RNA was either prepared from resting cells or cells activated with
PMA at 10 ng/ml and ionomycin at 1 µg/ml for 6 and 14 hours. Keratinocyte line CCD1106 and
an airway epithelial tumor line NCI-H292 were also obtained from the ATCC. Both were
cultured in DMEM 5% FCS (Hyclone), 100 µM non essential amino acids (Gibco), 1 mM
sodium pyruvate (Gibco), mercaptoethanol 5.5x10^-5 M (Gibco), and 10 mM Hepes (Gibco).
CCD1106 cells were activated for 6 and 14 hours with approximately 5 ng/ml TNF alpha and
1 ng/ml IL-1 beta, while NCI-H292 cells were activated for 6 and 14 hours with the following
cytokines: 5 ng/ml IL-4, 5 ng/ml IL-9, 5 ng/ml IL-13 and 25 ng/ml IFN gamma.

For these cell lines and blood cells, RNA was prepared by lysing approximately
10^ cells/ml using Trizol (Gibco BRL). Briefly, 1/10 volume of bromochloropropane
(Molecular Research Corporation) was added to the RNA sample, vortexed and after 10
minutes at room temperature, the tubes were spun at 14,000 rpm in a Sorvall SS34 rotor. The
aqueous phase was removed and placed in a 15 ml Falcon Tube. An equal volume of
isopropanol was added and left at -20°C overnight. The precipitated RNA was spun down at
9,000 rpm for 15 min in a Sorvall SS34 rotor and washed in 70% ethanol. The pellet was
redissolved in 300 µl of RNAse-free water and 35 µl buffer (Promega), 5 µl DTT, 7 µl RNAsin
and 8 µl DNAse were added. The tube was incubated at 37°C for 30 minutes to remove
contaminating genomic DNA, extracted once with phenol chloroform and re-precipitated
with 1/10 volume of 3M sodium acetate and 2 volumes of 100% ethanol. The RNA was spun
down and placed in RNAse free water. RNA was stored at -80°C.

AI_comprehensive panel_v1.0

The plates for AI_comprehensive panel_v1.0 include two control wells and 89 test
samples comprised of cDNA isolated from surgical and postmortem human tissues obtained 
from the Backus Hospital and Clinomics (Frederick, MD). Total RNA was extracted from 
tissue samples from the Backus Hospital in the Facility at CuraGen. Total RNA from other 
tissues was obtained from Clinomics. 

Joint tissues including synovial fluid, synovium, bone and cartilage were obtained
from patients undergoing total knee or hip replacement surgery at the Backus Hospital. 
Tissue samples were immediately snap frozen in liquid nitrogen to ensure that isolated RNA 
was of optimal quality and not degraded. Additional samples of osteoarthritis and rheumatoid 
arthritis joint tissues were obtained from Clinomics. Normal control tissues were supplied by 
Clinomics and were obtained during autopsy of trauma victims. 

Surgical specimens of psoriatic tissues and adjacent matched tissues were provided as 
total RNA by Clinomics. Two male and two female patients were selected between the ages 
of 25 and 47. None of the patients were taking prescription drugs at the time samples were 
isolated. 

Surgical specimens of diseased colon from patients with ulcerative colitis and Crohn's
disease and adjacent matched tissues were obtained from Clinomics. Bowel tissue from three
female and three male Crohn's patients between the ages of 41-69 was used. Two patients
were not on prescription medication while the others were taking dexamethasone,
phenobarbital, or tylenol. Ulcerative colitis tissue was from three male and four female
patients. Four of the patients were taking lebvid and two were on phenobarbital.

Total RNA from post mortem lung tissue from trauma victims with no disease or with
emphysema, asthma or COPD was purchased from Clinomics. Emphysema patients ranged in
age from 40-70 and all were smokers; this age range was chosen to focus on patients with
cigarette-linked emphysema and to avoid those patients with alpha-1 anti-trypsin deficiencies.
Asthma patients ranged in age from 36-75, and excluded smokers to avoid those patients
who could also have COPD. COPD patients ranged in age from 35-80 and included both
smokers and non-smokers. Most patients were taking corticosteroids and bronchodilators.

In the labels employed to identify tissues in the AI_comprehensive panel_v1.0 panel,
the following abbreviations are used:

AI = Autoimmunity
Syn = Synovial
Normal = No apparent disease
Rep22/Rep20 = Individual patients
RA = Rheumatoid arthritis
Backus = From Backus Hospital
OA = Osteoarthritis
(SS) (BA) (MF) = Individual patients
Adj = Adjacent tissue
Match control = Adjacent tissues
-M = Male
-F = Female
COPD = Chronic obstructive pulmonary disease

Panels 5D and 5I

The plates for Panels 5D and 5I include two control wells and a variety of cDNAs
isolated from human tissues and cell lines with an emphasis on metabolic diseases. Metabolic
tissues were obtained from patients enrolled in the Gestational Diabetes study. Cells were
obtained during different stages in the differentiation of adipocytes from human
mesenchymal stem cells. Human pancreatic islets were also obtained.

In the Gestational Diabetes study subjects are young (18-40 years), otherwise
healthy women with and without gestational diabetes undergoing routine (elective) Caesarean
section. After delivery of the infant, when the surgical incisions were being repaired/closed,
the obstetrician removed a small sample (<1 cc) of the exposed metabolic tissues during the
closure of each surgical level. The biopsy material was rinsed in sterile saline, blotted and
fast frozen within 5 minutes from the time of removal. The tissue was then flash frozen in
liquid nitrogen and stored, individually, in sterile screw-top tubes and kept on dry ice for
shipment to or to be picked up by CuraGen. The metabolic tissues of interest include uterine
wall (smooth muscle), visceral adipose, skeletal muscle (rectus) and subcutaneous adipose.
Patient descriptions are as follows:

Patient 2       Diabetic Hispanic, overweight, not on insulin
Patients 7-9    Nondiabetic Caucasian and obese (BMI>30)
Patient 10      Diabetic Hispanic, overweight, on insulin
Patient 11      Nondiabetic African American and overweight
Patient 12      Diabetic Hispanic on insulin

Adipocyte differentiation was induced in donor progenitor cells obtained from Osirus 
(a division of Clonetics/BioWhittaker) in triplicate, except for Donor 3U which had only two 
replicates. Scientists at Clonetics isolated, grew and differentiated human mesenchymal stem 
cells (HuMSCs) for CuraGen based on the published protocol found in Mark F. Pittenger, et
al., Multilineage Potential of Adult Human Mesenchymal Stem Cells, Science, Apr 2 1999:
143-147. Clonetics provided Trizol lysates or frozen pellets suitable for mRNA isolation and
ds cDNA production. A general description of each donor is as follows:

Donors 2 and 3 U: Mesenchymal Stem cells, Undifferentiated Adipose
Donors 2 and 3 AM: Adipose, Adipose Midway Differentiated
Donors 2 and 3 AD: Adipose, Adipose Differentiated

Human cell lines were generally obtained from ATCC (American Type Culture 
Collection), NCI or the German tumor cell bank and fall into the following tissue groups: 
kidney proximal convoluted tubule, uterine smooth muscle cells, small intestine, liver HepG2 
cancer cells, heart primary stromal cells, and adrenal cortical adenoma cells. These cells are 
all cultured under standard recommended conditions and RNA extracted using the standard 
procedures. All samples were processed at CuraGen to produce single stranded cDNA. 

Panel 5I contains all samples previously described with the addition of pancreatic
islets from a 58 year old female patient obtained from the Diabetes Research Institute at the
University of Miami School of Medicine. Islet tissue was processed to total RNA at an
outside source and delivered to CuraGen for addition to panel 5I.

In the labels employed to identify tissues in the 5D and 5I panels, the following
abbreviations are used: 

GO Adipose = Greater Omentum Adipose 

SK = Skeletal Muscle 

UT = Uterus 

PL = Placenta

AD = Adipose Differentiated 

AM = Adipose Midway Differentiated 

U = Undifferentiated Stem Cells 

Panel CNSD.01

The plates for Panel CNSD.01 include two control wells and 94 test samples
comprised of cDNA isolated from postmortem human brain tissue obtained from the Harvard
Brain Tissue Resource Center. Brains are removed from calvaria of donors between 4 and 24
hours after death, sectioned by neuroanatomists, and frozen at -80°C in liquid nitrogen vapor.
All brains are sectioned and examined by neuropathologists to confirm diagnoses with clear 
associated neuropathology. 

Disease diagnoses are taken from patient records. The panel contains two brains from 
each of the following diagnoses: Alzheimer's disease, Parkinson's disease, Huntington's 
disease, Progressive Supranuclear Palsy, Depression, and "Normal controls". Within each of 
these brains, the following regions are represented: cingulate gyrus, temporal pole, globus 
pallidus, substantia nigra, Brodmann Area 4 (primary motor strip), Brodmann Area 7 (parietal 
cortex), Brodmann Area 9 (prefrontal cortex), and Brodmann Area 17 (occipital cortex). Not all 
brain regions are represented in all cases; e.g., Huntington's disease is characterized in part by 
neurodegeneration in the globus pallidus, thus this region is impossible to obtain from 
confirmed Huntington's cases. Likewise Parkinson's disease is characterized by degeneration 
of the substantia nigra, making this region more difficult to obtain. Normal control brains 
were examined for neuropathology and found to be free of any pathology consistent with 
neurodegeneration. 

In the labels employed to identify tissues in the CNS panel, the following 
abbreviations are used: 

PSP = Progressive supranuclear palsy 

Sub Nigra = Substantia nigra 

Glob Palladus = Globus pallidus 

Temp Pole = Temporal pole 

Cing Gyr = Cingulate gyrus 

BA 4 = Brodmann Area 4 

Panel CNS_Neurodegeneration_V1.0 

The plates for Panel CNS_Neurodegeneration_V1.0 include two control wells and 47 
test samples comprised of cDNA isolated from postmortem human brain tissue obtained from 
the Harvard Brain Tissue Resource Center (McLean Hospital) and the Human Brain and 
Spinal Fluid Resource Center (VA Greater Los Angeles Healthcare System). Brains are 
removed from calvaria of donors between 4 and 24 hours after death, sectioned by 
neuroanatomists, and frozen at -80°C in liquid nitrogen vapor. All brains are sectioned and 
examined by neuropathologists to confirm diagnoses with clear associated neuropathology. 

Disease diagnoses are taken from patient records. The panel contains six brains from 
Alzheimer's disease (AD) patients, and eight brains from "Normal controls" who showed no 
evidence of dementia prior to death. The eight normal control brains are divided into two 
categories: controls with no dementia and no Alzheimer's-like pathology (Controls) and 
controls with no dementia but evidence of severe Alzheimer's-like pathology (specifically 
senile plaque load rated as level 3 on a scale of 0-3; 0 = no evidence of plaques, 3 = severe 
AD senile plaque load). Within each of these brains, the following regions are represented: 
hippocampus, temporal cortex (Brodmann Area 21), parietal cortex (Brodmann Area 7), and 
occipital cortex (Brodmann Area 17). These regions were chosen to encompass all levels of 
neurodegeneration in AD. The hippocampus is a region of early and severe neuronal loss in 
AD; the temporal cortex is known to show neurodegeneration in AD after the hippocampus; 
the parietal cortex shows moderate neuronal death in the late stages of the disease; the 
occipital cortex is spared in AD and therefore acts as a "control" region within AD patients. 
Not all brain regions are represented in all cases. 

In the labels employed to identify tissues in the CNS_Neurodegeneration_V1.0 panel, 
the following abbreviations are used: 

AD = Alzheimer's disease brain; patient was demented and showed AD-like 
pathology upon autopsy 

Control = Control brains; patient not demented, showing no neuropathology 
Control (Path) = Control brains; patient not demented but showing severe AD-like 
pathology 

SupTemporal Ctx = Superior Temporal Cortex 
Inf Temporal Ctx = Inferior Temporal Cortex 

A. NOV2 - CG57107-01: Pepsin A Precursor 

Expression of the NOV2 gene was assessed using the primer-probe set Ag809, 
described in Table AA. Results of the RTQ-PCR runs are shown in Tables AB, AC, AD, AE, 
AF, and AG. 

Table AA. Probe Name Ag809 

Primers | Sequences | Length | Start Position | SEQ ID NO: 
Forward | 5'-atgtgatctttggctgtgaagt-3' | 22 | 257 | 461 
Probe | TET-5'-ctaccccatggcctccatcgagt-3'-TAMRA | 23 | 228 | 462 
Reverse | 5'-ggatgtccaagccatcctt-3' | 19 | 204 | 463 
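
As a quick consistency check on Table AA, the listed lengths can be compared with the oligonucleotide sequences themselves. The Python sketch below is illustrative only: the dictionary layout and the helper names `check_lengths` and `reverse_complement` are assumptions introduced here, not part of the described method, and the TET and TAMRA labels are left out of the probe sequence.

```python
# Sequences and lengths as listed in Table AA for primer-probe set Ag809.
# The fluorophore/quencher labels (TET, TAMRA) are not part of the sequence.
AG809 = {
    "Forward": ("atgtgatctttggctgtgaagt", 22),
    "Probe": ("ctaccccatggcctccatcgagt", 23),
    "Reverse": ("ggatgtccaagccatcctt", 19),
}


def check_lengths(oligos):
    """Return, per oligo, whether the listed length equals the sequence length."""
    return {name: len(seq) == listed for name, (seq, listed) in oligos.items()}


def reverse_complement(seq):
    """Hypothetical helper: reverse complement of a lowercase DNA sequence."""
    complement = {"a": "t", "t": "a", "g": "c", "c": "g"}
    return "".join(complement[base] for base in reversed(seq.lower()))


if __name__ == "__main__":
    for name, ok in check_lengths(AG809).items():
        print(name, "length matches" if ok else "length mismatch")
```

The same check can be applied to any other primer-probe table in this section by swapping in its sequences and listed lengths.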



Table AB. General_screening_panel_v1.4 

Tissue Name | Rel. Exp.(%) Ag809, Run 220283339 | Tissue Name | Rel. Exp.(%) Ag809, Run 220283339 
Adipose | 0.7 | Renal ca. TK-10 | 6.7 
Melanoma* Hs688(A).T | 4.5 | Bladder | 2.7 
Melanoma* Hs688(B).T | 12.1 | Gastric ca. (liver met.) NCI-N87 | 7.0 
Melanoma* M14 | 10.7 | Gastric ca. KATO III | 10.0 
Melanoma* LOXIMVI | 0.8 | Colon ca. SW-948 | 2.9 
Melanoma* SK-MEL-5 | 0.9 | Colon ca. SW480 | 27.5 
Squamous cell carcinoma SCC-4 | 6.3 | Colon ca.* (SW480 met) SW620 | 18.6 
Testis Pool | 1.5 | Colon ca. HT29 | 3.5 
Prostate ca.* (bone met) PC-3 | 23.8 | Colon ca. HCT-116 | 5.0 
Prostate Pool | 2.7 | Colon ca. CaCo-2 | 28.5 
Placenta | 0.6 | Colon cancer tissue | 3.1 
Uterus Pool | 1.2 | Colon ca. SW1116 | 5.8 
Ovarian ca. OVCAR-3 | 7.7 | Colon ca. Colo-205 | 5.8 
Ovarian ca. SK-OV-3 | 0.4 | Colon ca. SW-48 | 2.0 
Ovarian ca. OVCAR-4 | 0.9 | Colon Pool | 3.0 
Ovarian ca. OVCAR-5 | 36.3 | Small Intestine Pool | 1.3 
Ovarian ca. IGROV-1 | 5.5 | Stomach Pool | 2.6 
Ovarian ca. OVCAR-8 | 5.5 | Bone Marrow Pool | 0.6 
Ovary | 0.6 | Fetal Heart | 1.6 
Breast ca. MCF-7 | 4.6 | Heart Pool | 13 
Breast ca. MDA-MB-231 | 7.2 | Lymph Node Pool | 42 
Breast ca. BT 549 | 17.0 | Fetal Skeletal Muscle | 4.9 
Breast ca. T47D | 100.0 | Skeletal Muscle Pool | 2.0 
Breast ca. MDA-N | 13.2 | Spleen Pool | 11.7 
Breast Pool | 3.2 | Thymus Pool | 4.1 
Trachea | 5.3 | CNS cancer (glio/astro) U87-MG | 13.0 
Lung | 0.5 | CNS cancer (glio/astro) U-118-MG | 72.2 
Fetal Lung | 2.9 | CNS cancer (neuro;met) SK-N-AS | 33.2 
Lung ca. NCI-N417 | 1.4 | CNS cancer (astro) SF-539 | 8.8 
Lung ca. LX-1 | 28.9 | CNS cancer (astro) SNB-75 | 8.8 
Lung ca. NCI-H146 | 1.6 | CNS cancer (glio) SNB-19 | 4.6 
Lung ca. SHP-77 | 7.5 | CNS cancer (glio) SF-295 | 62 
Lung ca. A549 | 4.9 | Brain (Amygdala) Pool | 0.5 
Lung ca. NCI-H526 | 0.8 | Brain (cerebellum) | 0.4 
Lung ca. NCI-H23 | 9.7 | Brain (fetal) | 0.4 
Lung ca. NCI-H460 | 4.5 | Brain (Hippocampus) Pool | 0.5 
Lung ca. HOP-62 | 1.7 | Cerebral Cortex Pool | 0.7 
Lung ca. NCI-H522 | 23.2 | Brain (Substantia nigra) Pool | 0.5 
Liver | 0.0 | Brain (Thalamus) Pool | 0.8 
Fetal Liver | 1.0 | Brain (whole) | 0.9 
Liver ca. HepG2 | 14.8 | Spinal Cord Pool | 0.4 
Kidney Pool | 3.6 | Adrenal Gland | 0.3 
Fetal Kidney | 1.6 | Pituitary gland Pool | 0.6 
Renal ca. 786-0 | 7.8 | Salivary Gland | 4.7 
Renal ca. A498 | 3.9 | Thyroid (female) | 0.6 
Renal ca. ACHN | 4.4 | Pancreatic ca. CAPAN2 | 
Renal ca. UO-31 | 0.5 | Pancreas Pool | 3.1 



Table AC. Panel 1.2 

Tissue Name | Rel. Exp.(%) Ag809, Run 118348423 | Rel. Exp.(%) Ag809, Run 121953937 | Tissue Name | Rel. Exp.(%) Ag809, Run 118348423 | Rel. Exp.(%) Ag809, Run 121953937 
Endothelial cells | 0.0 | 0.0 | Renal ca. 786-0 | 5.0 | 6.7 
Heart (Fetal) | | | Renal ca. A498 | 7.7 | 19 n 
Pancreas | 7.4 | 0.2 | Renal ca. RXF 393 | 0.7 | 0.7 
Pancreatic ca. CAPAN 2 | 1.9 | 4.1 | Renal ca. ACHN | 0.1 | 7.7 
Adrenal Gland | 3.1 | 6.7 | Renal ca. UO-31 | 0.0 | 0.0 
Thyroid | 6.7 | 1.0 | Renal ca. TK-10 | 0.1 | 0.0 
Salivary gland | 40.3 | 63.3 | Liver | 1.1 | 1.3 
Pituitary gland | 16.4 | 14.1 | Liver (fetal) | 0.8 | 1.7 
Brain (fetal) | 0.7 | 0.0 | Liver ca. (hepatoblast) HepG2 | 16.2 | 49.0 
Brain (whole) | 2.1 | 3.4 | Lung | 0.5 | 1.6 
Brain (amygdala) | 1.0 | 1.6 | Lung (fetal) | 3.1 | 1.9 
Brain (cerebellum) | 0.3 | 0.7 | Lung ca. (small cell) LX-1 | 44.4 | 33.4 
Brain (hippocampus) | 2.7 | 6.5 | Lung ca. (small cell) NCI-H69 | 1.8 | 0.3 
Brain (thalamus) | 1.1 | 0.8 | Lung ca. (s.cell var.) SHP-77 | 4.3 | 4.3 
Cerebral Cortex | 3.4 | 8.8 | Lung ca. (large cell) NCI-H460 | 13.1 | 45.7 
Spinal cord | 0.7 | 0.7 | Lung ca. (non-sm. cell) A549 | 9.4 | 15.6 
glio/astro U87-MG | 11.4 | 8.2 | Lung ca. (non-s.cell) NCI-H23 | 11.0 | 12.9 
glio/astro U-118-MG | 212 | 24.1 | Lung ca. (non-s.cell) HOP-62 | 6.2 | 2.7 
astrocytoma SW1783 | 1.6 | | Lung ca. (non-s.cl) NCI-H522 | 81.2 | 62.4 
neuro*; met SK-N-AS | 65.1 | 46.7 | Lung ca. (squam.) SW900 | 6.6 | 13.8 
astrocytoma SF-539 | 5.6 | 9.2 | Lung ca. (squam.) NCI-H596 | 2.0 | 0.1 
astrocytoma SNB-75 | 0.5 | 0.0 | Mammary gland | 5.3 | 4.6 
glioma SNB-19 | 2.4 | 2.9 | Breast ca.* (pl.ef) MCF-7 | 4.2 | 6.4 
glioma U251 | 1.1 | 0.9 | Breast ca.* (pl.ef) MDA-MB-231 | 2.0 | 6.0 
glioma SF-295 | 2.7 | 0.5 | Breast ca.* (pl.ef) T47D | 100.0 | 100.0 
Heart | 39.2 | 77.4 | Breast ca. BT-549 | 3.0 | 5.7 
Skeletal Muscle | 52.1 | | Breast ca. MDA-N | 17.4 | 20.4 
Bone marrow | 0.0 | 1.2 3 | Ovary | 2.1 | 3.8 
Thymus | 03 | 0.0 | Ovarian ca. OVCAR-3 | 8.4 | 43 
Spleen | 19.5 | 21.9 | Ovarian ca. | 2.3 | 3.1 
Lymph node | 0.3 | 6.9 | Ovarian ca. OVCAR-5 | 8.9 | 7.0 
Colorectal Tissue | 0.1 | 1.1 | Ovarian ca. OVCAR-8 | 4.9 | 13.1 
Stomach | 18.9 | 27.7 | Ovarian ca. IGROV-1 | 7.0 | 7.7 
Small intestine | 3.1 | 8.4 | Ovarian ca. (ascites) SK-OV-3 | 0.0 | 0.0 
Colon ca. SW480 | 5.2 | 8.3 | Uterus | 3.8 | 8.7 
Colon ca.* SW620 (SW480 met) | 19.2 | 20.4 | Placenta | 1.9 | 5.1 
Colon ca. HT29 | 4.1 | | | 13.4 | 33.0 
Colon ca. HCT-116 | 2.0 | 1.4 | Prostate ca.* (bone met) PC-3 | 42.9 | 76.3 
Colon ca. CaCo-2 | 35.1 | | | 8.1 | 9.4 
Colon ca. Tissue (OD03866) | 0.6 | 3.8 | Melanoma | 5.3 | 6.0 
Colon ca. HCC-2998 | 36.9 | 54.0 | Melanoma* (met) Hs688(B).T | 3.5 | 3.4 
Gastric ca.* (liver met) NCI-N87 | 11.9 | 14.9 | Melanoma UACC-62 | 2.9 | 3.7 
Bladder | 7.0 | 13.8 | Melanoma M14 | 11.5 | 21.9 
Trachea | 6.0 | 9.3 | Melanoma LOX IMVI | 2.6 | 1.8 
Kidney | 2.7 | 6.0 | Melanoma* (met) SK-MEL-5 | 1.7 | 32 
Kidney (fetal) | 6.5 | 28.5 | | | 





Table AD. Panel 1.3D 

Tissue Name | Rel. Exp.(%) Ag809, Run 152548777 | Tissue Name | Rel. Exp.(%) Ag809, Run 152548777 
Liver adenocarcinoma | 54.3 | Kidney (fetal) | 1.5 
Pancreas | 1.4 | Renal ca. 786-0 | 
Pancreatic ca. CAPAN 2 | 3.2 | Renal ca. A498 | 4./ 
Adrenal gland | 1.6 | Renal ca. RXF 393 | 3.2 
Thyroid | 4.4 | Renal ca. ACHN | o./ 
Salivary gland | 8.4 | Renal ca. UO-31 | 0.9 
Pituitary gland | 6.0 | Renal ca. TK-10 | 0.4 
Brain (fetal) | 0.0 | Liver | 0.6 
Brain (whole) | 2.0 | Liver (fetal) | 0.7 
Brain (amygdala) | 1.6 | Liver ca. (hepatoblast) HepG2 | 36.9 
Brain (cerebellum) | 0.0 | Lung | 2.7 
Brain (hippocampus) | 8.9 | Lung (fetal) | 6.1 
Brain (substantia nigra) | 0.6 | Lung ca. (small cell) LX-1 | 23.5 
Brain (thalamus) | 2.0 | Lung ca. (small cell) NCI-H69 | 2.0 
Cerebral Cortex | 1.6 | Lung ca. (s.cell var.) SHP-77 | 8.9 
Spinal cord | 1.8 | Lung ca. (large cell) NCI-H460 | 4.7 
glio/astro U87-MG | 8.2 | Lung ca. (non-sm. cell) A549 | 2.4 
glio/astro U-118-MG | 68.3 | Lung ca. (non-s.cell) NCI-H23 | 23.5 
astrocytoma SW1783 | 4.5 | Lung ca. (non-s.cell) HOP-62 | 7.2 
neuro*; met SK-N-AS | 35.6 | Lung ca. (non-s.cl) NCI-H522 | 32.5 
astrocytoma SF-539 | 8.6 | Lung ca. (squam.) SW 900 | 3.2 
astrocytoma SNB-75 | 8.1 | Lung ca. (squam.) NCI-H596 | 0.5 
glioma SNB-19 | 0.0 | Mammary gland | 6.8 
glioma U251 | 0.6 | Breast ca.* (pl.ef) MCF-7 | 3.8 
glioma SF-295 | 8.6 | Breast ca.* (pl.ef) MDA-MB-231 | 14.6 
Heart (fetal) | 29.9 | Breast ca.* (pl.ef) T47D | 45.1 
Heart | 9.5 | Breast ca. BT-549 | 5.2 
Skeletal muscle (fetal) | 100.0 | Breast ca. MDA-N | 12.8 
Skeletal muscle | 2.2 | Ovary | 8.2 
Bone marrow | 2.5 | Ovarian ca. OVCAR-3 | 2.4 
Thymus | 2.9 | Ovarian ca. OVCAR-4 | 2.7 
Spleen | 42.0 | Ovarian ca. OVCAR-5 | 6.2 
Lymph node | 2.3 | Ovarian ca. OVCAR-8 | 8.7 
Colorectal | 8.1 | Ovarian ca. IGROV-1 | 6.7 
Stomach | 19.3 | Ovarian ca.* (ascites) SK-OV-3 | 0.3 
Small intestine | 3.1 | Uterus | 4.7 
Colon ca. SW480 | 31.9 | Placenta | 3.2 
Colon ca.* SW620 (SW480 met) | 12.9 | Prostate | 10.7 
Colon ca. HT29 | 3.6 | Prostate ca.* (bone met) PC-3 | 24.5 
Colon ca. HCT-116 | 3.4 | Testis | 9.9 
Colon ca. CaCo-2 | 33.7 | Melanoma Hs688(A).T | 9.2 
Colon ca. tissue (OD03866) | 2.0 | Melanoma* (met) Hs688(B).T | 48.6 
Colon ca. HCC-2998 | 18.9 | Melanoma UACC-62 | 1.2 
Gastric ca.* (liver met) NCI-N87 | 7.2 | Melanoma M14 | 6.5 
Bladder | 2.5 | Melanoma LOX IMVI | 1.6 
Trachea | 16.0 | Melanoma* (met) SK-MEL-5 | 2.3 
Kidney | 1.2 | Adipose | 4.0 



Table AE. Panel 2D 

Tissue Name | Rel. Exp.(%) Ag809, Run 152550394 | Tissue Name | Rel. Exp.(%) Ag809, Run 152550394 
Normal Colon | 6.8 | Kidney Margin 8120608 | 1.5 
CC Well to Mod Diff (OD03866) | 6.1 | Kidney Cancer 8120613 | 2.0 
CC Margin (OD03866) | 2.5 | Kidney Margin 8120614 | 4.1 
CC Gr.2 rectosigmoid (OD03868) | 0.9 | Kidney Cancer 9010320 | 2.2 
CC Margin (OD03868) | 1.2 | Kidney Margin 9010321 | J, J 
CC Mod Diff (OD03920) | 3.8 | Normal Uterus | 
CC Margin (OD03920) | 1.3 | Uterus Cancer 064011 | 17.6 
CC Gr.2 ascend colon (OD03921) | 6.9 | Normal Thyroid | 3.7 
CC Margin (OD03921) | 4.0 | Thyroid Cancer 064010 | 1.7 
CC from Partial Hepatectomy (OD04309) Mets | 1.2 | Thyroid Cancer A302152 | 0.6 
Liver Margin (OD04309) | 0.6 | Thyroid Margin A302153 | 2.6 
Colon mets to lung (OD04451-01) | 4.4 | Normal Breast | 3.3 
Lung Margin (OD04451-02) | 1.2 | Breast Cancer (OD04566) | 0.9 
Normal Prostate 6546-1 | 10.2 | Breast Cancer (OD04590-01) | 67.8 
Prostate Cancer (OD04410) | 41.8 | Breast Cancer Mets (OD04590-03) | 51.1 
Prostate Margin (OD04410) | 25.7 | Breast Cancer Metastasis (OD04655-05) | 12.7 
Prostate Cancer (OD04720-01) | n.o | Breast Cancer 064006 | 8.9 
Prostate Margin (OD04720-02) | 10.0 | Breast Cancer 1024 | 7.8 
Normal Lung 061010 | 7.9 | Breast Cancer 9100266 | 6.2 
Lung Met to Muscle (OD04286) | 6.5 | Breast Margin 9100265 | 3.3 
Muscle Margin (OD04286) | 2.6 | Breast Cancer A209073 | 3.4 
Lung Malignant Cancer (OD03126) | 14.8 | Breast Margin A209073 | 8.7 
Lung Margin (OD03126) | 3.1 | Normal Liver | 1.1 
Lung Cancer (OD04404) | 2.0 | Liver Cancer 064003 | 0.6 
Lung Margin (OD04404) | 1.9 | Liver Cancer 1025 | 0.6 
Lung Cancer (OD04565) | 0.3 | Liver Cancer 1026 | 1.4 
Lung Margin (OD04565) | 1.9 | Liver Cancer 6004-T | 1.3 
Lung Cancer (OD04237-01) | 1.3 | Liver Tissue 6004-N | 1.3 
Lung Margin (OD04237-02) | 2.6 | Liver Cancer 6005-T | 1.1 
Ocular Mel Met to Liver (OD04310) | 0.1 | Liver Tissue 6005-N | 0.3 
Liver Margin (OD04310) | 0.6 | Normal Bladder | 5.9 
Melanoma Mets to Lung (OD04321) | 2.5 | Bladder Cancer 1023 | 1.7 
Lung Margin (OD04321) | 2.6 | Bladder Cancer A302173 | 1.9 
Normal Kidney | 5.6 | Bladder Cancer (OD04718-01) | 2.0 
Kidney Ca, Nuclear grade 2 (OD04338) | 0.6 | Bladder Normal Adjacent (OD04718-03) | 3.3 
Kidney Margin (OD04338) | 3.7 | Normal Ovary | 2.2 
Kidney Ca Nuclear grade 1/2 (OD04339) | 0.8 | Ovarian Cancer 064008 | 29.1 
Kidney Margin (OD04339) | 3.1 | Ovarian Cancer (OD04768-07) | 100.0 
Kidney Ca, Clear cell type (OD04340) | 1.5 | Ovary Margin (OD04768-08) | 2.2 
Kidney Margin (OD04340) | 5.1 | Normal Stomach | 13.1 
Kidney Ca, Nuclear grade 3 (OD04348) | 14.5 | Gastric Cancer 9060358 | 1 ,D 
Kidney Margin (OD04348) | 2.5 | Stomach Margin 9060359 | 8.8 
Kidney Cancer (OD04622-01) | 1.7 | Gastric Cancer 9060395 | 2.5 
Kidney Margin (OD04622-03) | 2.0 | Stomach Margin 9060394 | 9.7 
Kidney Cancer (OD04450-01) | 0.3 | Gastric Cancer 9060397 | 15.9 
Kidney Margin (OD04450-03) | 2.0 | Stomach Margin 9060396 | 12.9 
Kidney Cancer 8120607 | 7.0 | Gastric Cancer 064005 | 12.1 



Table AF. Panel 4.1D 

Tissue Name | Rel. Exp.(%) Ag809, Run 170402442 | Tissue Name | Rel. Exp.(%) Ag809, Run 170402442 
Secondary Th1 act | 2.5 | HUVEC IL-1beta | 0.9 
Secondary Th2 act | 2.7 | HUVEC IFN gamma | 3.0 
Secondary Tr1 act | 3.8 | HUVEC TNF alpha + IFN gamma | 0.9 
Secondary Th1 rest | 1.8 | HUVEC TNF alpha + IL4 | 0.7 
Secondary Th2 rest | 0.0 | HUVEC IL-11 | 2.5 
Secondary Tr1 rest | 0.0 | Lung Microvascular EC none | 2.5 
Primary Th1 act | 0.0 | Lung Microvascular EC TNFalpha + IL-1beta | 1.4 
Primary Th2 act | 0.4 | Microvascular Dermal EC none | 0.0 
Primary Tr1 act | 2.3 | Microvascular Dermal EC TNFalpha + IL-1beta | 2.3 
Primary Th1 rest | 4.4 | Bronchial epithelium TNFalpha + IL1beta | 2.5 
Primary Th2 rest | 1.4 | Small airway epithelium none | 3.2 
Primary Tr1 rest | 1.7 | Small airway epithelium TNFalpha + IL-1beta | 0.0 
CD45RA CD4 lymphocyte act | 14.9 | Coronary artery SMC rest | 1.5 
CD45RO CD4 lymphocyte act | 0.7 | Coronary artery SMC TNFalpha + IL-1beta | 2.7 
CD8 lymphocyte act | 0.7 | Astrocytes rest | 1.7 
Secondary CD8 lymphocyte rest | 0.8 | Astrocytes TNFalpha + IL-1beta | 1.4 
Secondary CD8 lymphocyte act | 2.3 | KU-812 (Basophil) rest | 5.7 
CD4 lymphocyte none | 0.0 | KU-812 (Basophil) PMA/ionomycin | 7.1 
2ry Th1/Th2/Tr1 anti-CD95 CH11 | 0.0 | CCD1106 (Keratinocytes) none | 3.3 
LAK cells rest | 3.2 | CCD1106 (Keratinocytes) TNFalpha + IL-1beta | 0.5 
LAK cells IL-2 | 2A | Liver cirrhosis | 3.3 
LAK cells IL-2+IL-12 | 13 | NCI-H292 none | 36.6 
LAK cells IL-2+IFN gamma | 1.2 | NCI-H292 IL-4 | 38.7 
LAK cells IL-2+IL-18 | 0.7 | NCI-H292 IL-9 | 46.7 
LAK cells PMA/ionomycin | 0.9 | NCI-H292 IL-13 | 37.4 
NK Cells IL-2 rest | 0.9 | NCI-H292 IFN gamma | 53.2 
Two Way MLR 3 day | 1.8 | HPAEC none | 4.0 
Two Way MLR 5 day | 1.6 | HPAEC TNF alpha + IL-1 beta | 
Two Way MLR 7 day | 2.9 | Lung fibroblast none | 62.4 
PBMC rest | | Lung fibroblast TNF alpha + IL-1 beta | 33.2 
PBMC PWM | 1.1 | Lung fibroblast IL-4 | 100.0 
 | 3.1 | Lung fibroblast IL-9 | 86.5 
Ramos (B cell) none | 0.6 | Lung fibroblast IL-13 | 66.0 
Ramos (B cell) ionomycin | 4.9 | Lung fibroblast IFN gamma | 84.1 
B lymphocytes PWM | 2.0 | Dermal fibroblast CCD1070 rest | 90.8 
B lymphocytes CD40L and IL-4 | 1.1 | Dermal fibroblast CCD1070 TNF alpha | 25.7 
EOL-1 dbcAMP | 12.7 | Dermal fibroblast CCD1070 IL-1 beta | 40.3 
EOL-1 dbcAMP PMA/ionomycin | 4.0 | Dermal fibroblast IFN gamma | 
Dendritic cells none | 5.8 | Dermal fibroblast IL-4 | 71.2 
Dendritic cells LPS | 2.8 | Dermal Fibroblasts rest | 20.3 
Dendritic cells anti-CD40 | 1.4 | Neutrophils TNFa+LPS | 4.3 
Monocytes rest | 2.0 | Neutrophils rest | 0.0 
Monocytes LPS | 0.9 | Colon | 9.2 
Macrophages rest | 3.8 | Lung | 0.0 
Macrophages LPS | 3.3 | Thymus | 36.1 
HUVEC none | 0.4 | Kidney | 99.3 
HUVEC starved | 4.7 | | 

Table AG. Panel 4D 



Tissue Name | Rel. Exp.(%) Ag809, Run 152552244 | Tissue Name | Rel. Exp.(%) Ag809, Run 152552244 
Secondary Th1 act | 2.0 | HUVEC IL-1beta | 1.2 
Secondary Th2 act | 1.5 | HUVEC IFN gamma | 1.4 
Secondary Tr1 act | 2.5 | HUVEC TNF alpha + IFN gamma | 0.8 
Secondary Th1 rest | 1.0 | HUVEC TNF alpha + IL4 | 1.1 
Secondary Th2 rest | 3.0 | HUVEC IL-11 | 3.0 
Secondary Tr1 rest | | Lung Microvascular EC none | 0.8 
Primary Th1 act | 0.4 | Lung Microvascular EC TNFalpha + IL-1beta | 0 s 
Primary Th2 act | 1.5 | Microvascular Dermal EC none | 1.1 
Primary Tr1 act | 2.0 | Microvascular Dermal EC TNFalpha + IL-1beta | 1.0 
Primary Th1 rest | 5.4 | Bronchial epithelium TNFalpha + IL1beta | 0.0 
Primary Th2 rest | 3.1 | Small airway epithelium none | 0.4 
Primary Tr1 rest | f. f. | Small airway epithelium TNFalpha + IL-1beta | 0.5 
CD45RA CD4 lymphocyte act | 11.2 | Coronary artery SMC rest | 5.8 
CD45RO CD4 lymphocyte act | IJZ | Coronary artery SMC TNFalpha + IL-1beta | 2.3 
CD8 lymphocyte act | 0.9 | Astrocytes rest | 2.7 
Secondary CD8 lymphocyte rest | 0.0 | Astrocytes TNFalpha + IL-1beta | 0.0 
Secondary CD8 lymphocyte act | 0.6 | KU-812 (Basophil) rest | 6.8 
CD4 lymphocyte none | 1.1 | KU-812 (Basophil) PMA/ionomycin | 8.4 
2ry Th1/Th2/Tr1_anti-CD95 CH11 | 0.0 | CCD1106 (Keratinocytes) none | 1.6 
LAK cells rest | 0.5 | CCD1106 (Keratinocytes) TNFalpha + IL-1beta | 1.4 
LAK cells IL-2 | 0.0 | Liver cirrhosis | 42 
LAK cells IL-2+IL-12 | 0.7 | Lupus kidney | 1.8 
LAK cells IL-2+IFN gamma | 1.1 | NCI-H292 none | 39.5 
LAK cells IL-2+IL-18 | 03 | NCI-H292 IL-4 | 39.0 
LAK cells PMA/ionomycin | 0.0 | NCI-H292 IL-9 | 65.5 
NK Cells IL-2 rest | 1.3 | NCI-H292 IL-13 | 37.1 
Two Way MLR 3 day | 0.5 | NCI-H292 IFN gamma | 31.9 
Two Way MLR 5 day | 0.5 | HPAEC none | 0.5 
Two Way MLR 7 day | 2.6 | HPAEC TNF alpha + IL-1 beta | 1.2 
PBMC rest | 0.0 | Lung fibroblast none | 42.3 
PBMC PWM | 1.3 | Lung fibroblast TNF alpha + IL-1 beta | 17.8 
PBMC PHA-L | 1.0 | Lung fibroblast IL-4 | 100.0 
Ramos (B cell) none | 1.2 | Lung fibroblast IL-9 | 72.7 
Ramos (B cell) ionomycin | 23 | Lung fibroblast IL-13 | 60.7 
B lymphocytes PWM | 4.3 | Lung fibroblast IFN gamma | 81.8 
B lymphocytes CD40L and IL-4 | 1.4 | Dermal fibroblast CCD1070 | 76.8 
EOL-1 dbcAMP | 72 | Dermal fibroblast CCD1070 TNF alpha | 30.1 
EOL-1 dbcAMP PMA/ionomycin | 3.0 | Dermal fibroblast CCD1070 IL-1 beta | 38.2 
Dendritic cells none | 1.5 | Dermal fibroblast IFN gamma | 34.2 
Dendritic cells LPS | 0.7 | Dermal fibroblast IL-4 | 80.7 
Dendritic cells anti-CD40 | 0.5 | IBD Colitis 2 | 0.3 
Monocytes rest | 0.5 | IBD Crohn's | 1.4 
Monocytes LPS | 0.0 | Colon | 35.6 
Macrophages rest | 1.3 | Lung | 11.0 
Macrophages LPS | 1.7 | Thymus | 5.8 
HUVEC none | 2.3 | Kidney | 9.7 
HUVEC starved | 9.0 | | 







General_screening_panel_v1.4 Summary: Ag809 Highest expression of the NOV2 gene is 
seen in a breast cancer cell line (CT=27.2). Significant expression is also seen in a cluster of 
cell lines derived from breast cancer, colon cancer and brain cancer. Thus, expression of this 
gene could be used to differentiate between these samples and other samples on this panel 
and as a marker to detect the presence of breast, colon, and brain cancer. Furthermore, 
therapeutic modulation of the expression or function of this gene may be effective in the 
treatment of breast, colon, and brain cancers. 
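
The relationship between a CT value such as the one quoted above and the "Rel. Exp.(%)" figures in the tables can be sketched as follows. This is a minimal illustration under stated assumptions: it assumes ideal amplification (template doubling each cycle) and normalization to the sample with the lowest CT, which is the common comparative-CT convention and not necessarily the exact calculation used for these panels. Only the CT of 27.2 comes from the text; the function name and the other sample names and CT values are hypothetical.

```python
def relative_expression(ct_by_sample):
    """Percent expression relative to the lowest-CT (highest-expressing) sample.

    Assumes each additional cycle corresponds to a two-fold lower template
    amount, so a sample one cycle later than the minimum scores 50%.
    """
    ct_min = min(ct_by_sample.values())
    return {
        sample: 100.0 * 2.0 ** (ct_min - ct)
        for sample, ct in ct_by_sample.items()
    }


if __name__ == "__main__":
    # 27.2 is the CT reported for the highest-expressing breast cancer line;
    # the other two entries are made-up values for illustration only.
    example = {
        "highest-expressing sample": 27.2,
        "hypothetical sample A": 28.2,
        "hypothetical sample B": 30.2,
    }
    for sample, pct in relative_expression(example).items():
        print(f"{sample}: {pct:.1f}")
```

Under these assumptions the sample with the lowest CT is always reported as 100.0, which matches the way each run in the tables has exactly one tissue at 100.0.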

Panel 1.2 Summary: Ag809 Two experiments with the same probe and primer set 
produce results that are in excellent agreement, with highest expression of the NOV2 gene in 
a breast cancer cell line (CTs=26-27). In addition, significant expression is also seen in most 
cancer cell lines in this panel, including prostate, brain, colon, ovarian, liver and lung cancers. 
Thus, expression of this gene could be used to differentiate between these samples and other 
samples on this panel and as a marker to detect the presence of cancer. Furthermore, 
therapeutic modulation of the expression or function of this gene may be effective in the 
treatment of prostate, brain, colon, ovarian, liver and lung cancers. 

Among tissues with metabolic function, this gene is expressed at moderate to low 
levels in pituitary, adrenal gland, pancreas, thyroid, skeletal muscle and adult and fetal heart 
and liver. This widespread expression among these tissues suggests that this gene product 
may play a role in normal neuroendocrine and metabolic function and that dysregulated expression of 
this gene may contribute to neuroendocrine disorders or metabolic diseases, such as obesity 
and diabetes. 

This gene also exhibits moderate expression in the brain, especially in the 
hippocampus. The hippocampus is a region of specific neurodegeneration in Alzheimer's 
disease, that is thought to be mediated by the amyloid precursor protein processing enzyme, 
beta secretase. Beta secretase is a drug target of utility in the treatment of Alzheimer's 
disease. Since both this gene product and beta secretase are aspartyl proteases, the protein 
encoded by this gene may have potential utility as a drug target to treat Alzheimer's disease. 

References: 

Mallender WD, Yager D, Onstead L, Nichols MR, Eckman C, Sambamurti K, 
Kopcho LM, Marcinkeviciene J, Copeland RA, Rosenberry TL. Characterization of 
recombinant, soluble beta-secretase from an insect cell expression system. Mol Pharmacol 
2001 Mar;59(3):619-26 

The beta-site amyloid precursor protein-cleaving enzyme (BACE) cleaves the 
amyloid precursor protein to produce the N terminus of the amyloid beta peptide, a major 
component of the plaques found in the brains of Alzheimer's disease patients. Sequence 
analysis of BACE indicates that the protein contains the consensus sequences found in most 
known aspartyl proteases, but otherwise has only modest homology with aspartyl proteases of 
known three-dimensional structure (i.e., pepsin, renin, or cathepsin D). Because BACE has 
been shown to be one of the two proteolytic activities responsible for the production of the 
Abeta peptide, this enzyme is a prime target for the design of therapeutic agents aimed at 
reducing Abeta for the treatment of Alzheimer's disease. Toward this ultimate goal, we have 
expressed a recombinant, truncated human BACE in a Drosophila melanogaster S2 cell 
expression system to generate high levels of secreted BACE protein. The protein was 
convenient to purify and was enzymatically active and specific for cleaving the beta-secretase 
site of human APP, as demonstrated with soluble APP as the substrate in novel sandwich 
enzyme-linked immunosorbent assay and Western blot assays. Further kinetic analysis 
revealed no catalytic differences between this recombinant, secreted BACE, and brain BACE. 
Both showed a strong preference for substrates that contained the Swedish mutation, where 
NL is substituted for KM immediately upstream of the cleavage site, relative to the wild-type 
sequence, and both showed the same extent of inhibition by a peptide-based inhibitor. The 
capability to produce large quantities of BACE enzyme will facilitate protein structure 
determination and inhibitor development efforts that may lead to the evolution of useful 
Alzheimer's disease treatments. 

Panel 1.3D Summary: Ag809 Highest expression of the CG57107-01 gene is seen in fetal 
skeletal muscle (CT=29.6). This gene also has low levels of expression in thyroid, pituitary, 
adult and fetal heart, and adipose. This widespread expression in tissues of metabolic origin 
suggests that this gene product may be a small molecule target for the treatment of endocrine 
or metabolic disease, including thyroidopathies and obesity. 

Significant expression of this gene is also seen in brain, colon, lung and breast cancer 
cell lines as well as a melanoma cell line. This prominent expression in cancer cell lines is 
consistent with expression in Panels 1.2 and 2D. Therefore, expression of this gene could be 
used as a diagnostic marker for cancers of these tissues. Furthermore, therapeutic modulation 
of the gene product using antibodies and small molecule drugs may be used for the treatment 
of these cancers. 

This gene also shows low levels of expression in the CNS. Please see Panel 1.2 for 
discussion of utility of this gene in the central nervous system. 

Panel 2D Summary: Ag809 The CG57107-01 gene is expressed at a higher level in 
prostate, ovarian and breast cancer compared to the adjacent normal tissues. Therefore, 
expression of this gene could be used as a diagnostic marker for the presence of these 
cancers. Furthermore, therapeutic inhibition of the gene product using antibodies and small 
molecule drugs may be useful for the treatment of these cancers. 
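
The tumor-versus-margin comparison described for Panel 2D can be made concrete by taking the ratio of relative expression in a cancer sample to that in its matched adjacent tissue. The sketch below uses three matched pairs whose values are read from Table AE; the function names and the two-fold cutoff are hypothetical choices made for illustration and are not taken from the source.

```python
# (cancer, adjacent margin) relative expression values (%) from Table AE.
MATCHED_PAIRS = {
    "Ovary (OD04768)": (100.0, 2.2),
    "Breast (9100266 / 9100265)": (6.2, 3.3),
    "Kidney (OD04450)": (0.3, 2.0),
}


def tumor_to_margin_ratio(pairs):
    """Ratio of cancer to matched-margin expression for each pair.

    Margin values must be non-zero; pairs with a 0.0 margin would need
    separate handling.
    """
    return {name: tumor / margin for name, (tumor, margin) in pairs.items()}


def flag_elevated(pairs, min_ratio=2.0):
    """Names of pairs whose ratio meets a hypothetical cutoff (default 2-fold)."""
    ratios = tumor_to_margin_ratio(pairs)
    return sorted(name for name, ratio in ratios.items() if ratio >= min_ratio)


if __name__ == "__main__":
    for name, ratio in tumor_to_margin_ratio(MATCHED_PAIRS).items():
        print(f"{name}: {ratio:.2f}-fold")
    print("Elevated in tumor:", flag_elevated(MATCHED_PAIRS))
```

A ratio well above 1 supports the kind of tumor-associated expression the summary describes, while pairs such as the kidney example show that the pattern is not uniform across tissues.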

Panel 3D Summary: Ag809 Results from one experiment with the CG57107-01 gene are 
not included. The amp plot indicates that there were experimental difficulties with this run. 

Panels 4D/4.1D Summary: Ag809 Significant expression of the CG57107-01 gene is 
limited to fibroblast and NCI-H292 cells (CTs=31-33). Expression is also seen in normal 
thymus and kidney. Therefore, expression of this transcript or the protein it encodes could be 
used as a marker for these tissues. In addition, therapeutics designed with the protein encoded 
by this transcript could be used to regulate the expression of this putative enzyme. 

B. NOV3 - CG56936-01: Ribonuclease Pancreatic-like Protein 

Expression of the NOV3 gene was assessed using the primer-probe set Ag2477, 
described in Table BA. Results of the RTQ-PCR runs are shown in Tables BB and BC. 



Table BA. Probe Name Ag2477 

Primers | Sequences | Length | Start Position | SEQ ID NO: 
Forward | 5'-ctgcaaccacatgatcatacaa-3' | 22 | 273 | 464 
Probe | TET-5'-atcagggaacctgaccacacttf^aa-3'-TAMRA | 26 | 241 | 465 
Reverse | 5'-atggatgaagacatgctccttt-3' | 22 | 219 | 466 



Table BB . Panel 1.3D 



Tissue Name 


ReL Exp.(%) Ag2477, Run 
165639391 


Tissue Name 


R^l Fvn 

Ag2477, Run 
165639391 


Liver adenocarcinoma 


0.0 


Kidney (fetal) 


0.0 


Pancreas 


0.0 


Renal ca. 786-0 


0.0 


Pancreatic ca. CAPAN 2 


0.0 


Renal ca. A498 


0.0 


Adrenal gland 


0.0 


Renal ca. RXF 393 


0.0 


Thyroid 


0.0 


Renal ca. ACHN 


0.0 


Salivary gland 


0.0 


Renal ca.UO-31 


0.0 


Pituitary gland 


0.0 


Renal ca.TK-10 


0.0 


Brain (fetal) 


0.0 


Liver 


0.0 


Brain (whole) 


0.0 


Liver (fetal) 


0.0 


Brain (amygdala) 


0.0 


Liver ca. (hepatoblast) 
HepG2 


0.0 


Brain (cerebellum) 


0.0 


Lung 


0.0 


Brain (hippocampus) 


0.0 


Lung (fetal) 


0.0 


Brain (substantia nigra) 


0.0 


Lung ca. (small cell) LX- 
1 


0.0 


Brain (thalamus) 


0.0 


Lung ca. (small cell) 
NCI-H69 


0.0 


Cerebral Cortex 


0.0 


Lung ca. (s.cell van) 
SHP-77 


18.8 


Spinal cord 


0.0 


Lung ca. (large cell)NCI- 
H460 


0.0 


glio/astro U87-MG 


0.0 


Lung ca. (non-sm. cell) 
A549 


0.0 


glio/astroU-118-MG 


0.0 


Lung ca. (non-s.cell) 

NCI-H23 


0.0 


astrocytoma SW1783 


0.0 


Lung ca. (non-s.cell) 


0.0 
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HOP-62 


neuro*; met SK-N-AS 


0.0 


Lung ca. (non-s.cl) NCI- 

H522 


0.0 


astrocytoma SF-539 


0.0 


Lung ca. (squatn.) SW 
900 


0.0 


asirocyioma oxnjd- / j 


n n 

v/.v 


Lung ca. (squam.) NCI- 

H596 


U.U 


glioma SNB-1 9 


0.0 


Mammary gland 


0,0 


glioma U251 


7.5 


Breast ca.* (pl.ef) MCF-7 J 


0.0 


glioma SF-295 


13.7 


Breast ca.* (pl.ef) MDA- 
MB-231 




Heart (fetal) 


0.0 


Breast ca.* (pl.ef) T47D 


0.0 


Heart 


0.0 


Breast ca. BT-549 


0.0 


Skeletal muscle (fetal) 


0.0 


Breast ca. MDA-N 


0.0 


Skeletal muscle 


0.0


Ovary 


0.0 


Bone marrow 


0.0 


Ovarian ca. OVCAR-3 


0.0 


Thymus 


0.0 


Ovarian ca. OVCAR-4 


0.0 


Spleen 


0.0 


Ovarian ca. OVCAR-5


0.0 


Lymph node 


0.0 


Ovarian ca. OVCAR-8 


0.0 


Colorectal 


0.0 


Ovarian ca. IGROV-1


n ft 


Stomach 


0.0 


Ovarian ca.* (ascites)
SK-OV-3


0.0 


Small intestine 


0.0 


Uterus 


0.0 


Colon ca. SW480 


0.0 


Placenta


0.0


Colon ca.* SW620(SW480 
met) 


0.0 


Prostate 


0.0 


Colon ca. HT29 


0.0 


Prostate ca.* (bone
met) PC-3


0.0 


Colon ca. HCT-116


0.0 


Testis 


100.0 


Colon ca. CaCo-2 


0.0 


Melanoma Hs688(A).T


ft ft 


Colon ca. 
tissue(OD03866) 


0.0 


Melanoma* (met) 
Hs688(B).T 


0.0 


Colon ca. HCC-2998 


0.0 


Melanoma UACC-62 


0.0 


Gastric ca.* (liver met) 
NCI-N87 


0.0 


Melanoma M14


0.0 


Bladder 


0.0 


Melanoma LOXIMVI 


0.0 


Trachea 


9.1 


Melanoma* (met) SK- 
MEL-5 


0.0 


Kidney 


0.0 


Adipose 


0.0 



Table BC. Panel 4D



Tissue Name


Rel. Exp.(%) Ag2477,
Run 164391869


Tissue Name


Rel. Exp.(%) Ag2477,
Run 164391869


Secondary Th1 act


0.0 


HUVEC IL-1beta


0.0 


Secondary Th2 act


0.0 


HUVEC IFN gamma 


0.0 


Secondary Tr1 act


0.0 


HUVEC TNF alpha + IFN
gamma


0.0




Secondary Th1 rest


0.0 


HUVEC TNF alpha + IL4 


0.0 


Secondary Th2 rest 


0.0 


HUVEC IL-11


0.0 


Secondary Tr1 rest


0.0 


Lung Microvascular EC none 


0.0 


Primary Th1 act


35.8 


Lung Microvascular EC
TNFalpha + IL-1beta


0.0 


Primary Th2 act 


0.0 


Microvascular Dermal EC
none 


0.0 


Primary Tr1 act


0.0 


Microvascular Dermal EC
TNFalpha + IL-1beta


0.0 


Primary Th1 rest


0.0 


Bronchial epithelium 
TNFalpha + IL1beta


0.0 


Primary Th2 rest 


0.0 


Small airway epithelium 
none 


0.0 


Primary Tr1 rest


0.0 


Small airway epithelium 
TNFalpha + IL-1 beta 


0.0 


CD45RA CD4 
lymphocyte act


0.0 


Coronary artery SMC rest


0.0 


CD45RO CD4 
lymphocyte act 


0.0 


Coronary artery SMC
TNFalpha + IL-1 beta 


0.0 


CD8 lymphocyte act


0.0 


Astrocytes rest 


0.0 


Secondary CD8
lymphocyte rest 


0.0 


Astrocytes TNFalpha +
IL-1beta


0.0 


Secondary CD8
lymphocyte act


0.0 


KU-812 (Basophil) rest 


0.0 


CD4 lymphocyte none


0.0 


KU-812 (Basophil) 


0.0 


2ry Th1/Th2/Tr1 anti-
CD95 CH11


0.0 


CCD1106 (Keratinocytes)
none


0.0 


LAK cells rest 


0.0 


CCD1106 (Keratinocytes)
TNFalpha + IL-1beta


0.0 


LAK cells IL-2 


17.0 


Liver cirrhosis


100.0 


LAK cells IL-2+IL-12 


0.0 


Lupus kidney 


0.0 


LAK cells IL-2+IFN 
gamma 


0.0 


NCI-H292 none


0.0 


LAK cells IL-2+IL-18


0.0 


NCI-H292 IL-4 


0.0 


LAK cells 
PMA/ionomycin 


0.0 


NCI-H292 IL-9 




NK Cells IL-2 rest


0.0 


NCI-H292 IL-13


0.0 


Two Way MLR 3 day 


0.0 


NCI-H292 IFN gamma 


0.0 


Two Way MLR 5 day 


0.0 


HPAEC none 


0.0 


Two Way MLR 7 day 


0.0 


HPAEC TNF alpha + IL-1 
beta 


0.0 


PBMC rest 


0.0 


Lung fibroblast none 


0.0 


PBMC PWM


0.0 


Lung fibroblast TNF alpha + 
IL-1 beta 


0.0 


PBMC PHA-L


0.0 


Lung fibroblast IL-4


0.0


Ramos (B cell) none


0.0 


Lung fibroblast IL-9 


0.0 


Ramos (B cell) 
ionomycin 


0.0 


Lung fibroblast IL-13 


0.0 


B lymphocytes PWM 


0.0


Lung fibroblast IFN gamma 


0.0 


B lymphocytes CD40L
and IL-4


0.0 


Dermal fibroblast CCD1070
rest


0.0 


EOL-1 dbcAMP 


0.0 


Dermal fibroblast CCD1070
TNF alpha 


0.0 


EOL-1 dbcAMP 
PMA/ionomycin 


0.0 


Dermal fibroblast CCD1070
IL-1 beta 


0.0 


Dendritic cells none 




Dermal fibroblast IFN 
gamma 




Dendritic cells LPS 


0.0 


Dermal fibroblast IL-4 


0.0 


Dendritic cells anti- 
CD40 


0.0 


IBD Colitis 2 


22.8 


Monocytes rest 


0.0 


IBD Crohn's 


0.0 


Monocytes LPS 


0.0 


Colon 


0.0 


Macrophages rest 


0.0 


Lung 


0.0 


Macrophages LPS 


0.0 


Thymus 


8.0 


HUVEC none


0.0 


Kidney 


0.0 


HUVEC starved 


0.0 







Panel 1.3D Summary: Ag2477 Significant expression of the NOV3 gene is restricted to the 
testis (CT=33.1). Thus, expression of this gene could be used to differentiate testis tissue 
from other tissues. Furthermore, the highly specific expression of the NOV3 gene suggests 
that its protein product may be involved in the normal function of the testis. Thus, therapeutic 
modulation of the expression or function of this gene may be useful in the treatment of 
infertility and other disorders that involve the testis. 

Panel 4D Summary: Ag2477 The NOV3 transcript is expressed almost exclusively in liver
cirrhosis (CT=33.5) but not in normal liver. The protein encoded by this transcript may be
involved in or associated with the pathology of this tissue and may serve as a diagnostic marker
for liver cirrhosis or other inflammatory liver diseases.

C. NOV4 - CG51707-02: SER/THR PROTEIN KINASE

Expression of the NOV4 gene was assessed using the primer-probe sets Ag2827 and 
Ag3274, described in Tables CA and CB. Results of the RTQ-PCR runs are shown in Tables 
CC, CD, and CE. 



Table CA. Probe Name Ag2827

Primers  Sequences                                     Length  Start Position  SEQ ID NO:
Forward  5'-actggtgctgaagatcatgagt-3'                  22      670             467
Probe    TET-5'-cacctttgcacctatctctgaccggt-3'-TAMRA    26      694             468
Reverse  5'-aggctccaggctgagtagact-3'                   21      749             469



Table CB. Probe Name Ag3274

Primers  Sequences                                     Length  Start Position  SEQ ID NO:
Forward  5'-tacgagaacttcctggaagaca-3'                  22      239             470
Probe    TET-5'-aagcccttatgaccgccatggaatat-3'-TAMRA    26      261             471
Reverse  5'-attacagcgcttttggatgaa-3'                   21      311             472



Table CC. Panel 1.3D 



Tissue Name 


Rel. Exp.(%) Ag2827, Run
165528176 


Tissue Name 


Rel. Exp.(%) 
Ag2827, Run 
165528176 


Liver adenocarcinoma 


31.9 


Kidney (fetal) 


51.4 


Pancreas 


17.9 


Renal ca. 786-0 


18.4 


Pancreatic ca. CAPAN 2 


10.8 


Renal ca. A498 


39.2 


Adrenal gland 


19.1 


Renal ca. RXF 393 


5.9 


Thyroid 


40.1 


Renal ca. ACHN 


24.7 


Salivary gland 


37.9 


Renal ca. UO-31 


14.7 


Pituitary gland


52.1 


Renal ca. TK-10


3.9 


Brain (fetal) 


3.7 


Liver 


12.1 


Brain (whole) 


7.2 


Liver (fetal) 


58.2 


Brain (amygdala) 


23.7 


Liver ca. (hepatoblast) 
HepG2 


95.9 


Brain (cerebellum) 


6.9 


Lung 


14.0 


Brain (hippocampus) 


16.7 


Lung (fetal) 


71^ 


Brain (substantia nigra) 


6.7 


Lung ca. (small cell) 
LX-1 


25.7 


Brain (thalamus) 


32.8 


Lung ca. (small cell) 
NCI-H69 


14.3 


Cerebral Cortex 


14.6 


Lung ca. (s.cell var.) 
SHP-77 


8.0 


Spinal cord 


22.1 


Lung ca. (large
cell) NCI-H460


45.1 


glio/astro U87-MG


12.0 


Lung ca. (non-sm. cell) 
A549 


14.5 


glio/astro U-118-MG


28.3 


Lung ca. (non-s.cell) 
NCI-H23 


21.2 


astrocytoma SW1783 


11.3 


Lung ca. (non-s.cell) 
HOP-62 


27.4 


neuro*; met SK-N-AS 


6.7 


Lung ca. (non-s.cl) NCI- 
H522 


6.0 


astrocytoma SF-539 


28.1 


Lung ca. (squam.) SW
900


17.6




astrocytoma SNB-75 


17.8 


Lung ca. (squam.) NCI- 
H596 


22.4 


glioma SNB-19 


12.9 


Mammary gland 


24.0 


glioma U251


1 /A 


Breast ca.* (pl.ef) MCF-7




glioma SF-295 


24.7


Breast ca.* (pl.ef)
MDA-MB-231


36.9 


Heart (fetal) 


0.0 


Breast ca.* (pl.ef) T47D


52.1 


Heart 


2.1 


Breast ca. BT-549 


21.2 


Skeletal muscle (fetal) 


9.2 


Breast ca. MDA-N 


0.0 


Skeletal muscle 


6.6 


Ovary 


2.3 


Bone marrow 


24.1 


Ovarian ca. OVCAR-3


7.6 


Thymus 


22.4 


Ovarian ca. OVCAR-4 


0.0 


Spleen 


13.6 


Ovarian ca. OVCAR-5


27.0 


Lymph node 


100.0 


Ovarian ca. OVCAR-8


4.5 


Colorectal 


7.1 


Ovarian ca. IGROV-1


6.0 


Stomach 


19.8 


Ovarian ca.* (ascites)
SK-OV-3


27.0 


Small intestine 


39.2 


Uterus 


24.7 


Colon ca. SW480 


9.3


Placenta 


42.9 


Colon ca.* SW620(SW480 met)


14.5 


Prostate 


29.7 


Colon ca. HT29 


10.7 


Prostate ca.* (bone 
met)PC-3 


13.2 


Colon ca. HCT-116 


6.6 


Testis 


20.7 


Colon ca. CaCo-2 


20.4 


Melanoma Hs688(A).T


7.2 


Colon ca.
tissue(OD03866)


6.9 


Melanoma* (met)
Hs688(B).T


Colon ca. HCC-2998 


33.7 


Melanoma UACC-62 


16.2 


Gastric ca.* (liver met) 
NCI-N87 


42.9 


Melanoma M14 


9.0 


Bladder 


35.4 


Melanoma LOXIMVI


12.1 


Trachea 


48.3 


Melanoma* (met) SK- 
MEL-5 


10.0 


Kidney 


39.5 


Adipose 


4.2 



Table CD. Panel 2D 



Tissue Name 


Rel. Exp.(%) Ag2827,
Run 162599361 


Tissue Name 


Rel. Exp.(%) Ag2827,
Run 162599361 


Normal Colon 


38.4 


Kidney Margin 
8120608 


18.6 


CC Well to Mod Diff
(OD03866) 


7.8 


Kidney Cancer 8120613 


6.0 


CC Margin (OD03866) 


9.6 


Kidney Margin 
8120614 


25.0 


CC Gr.2 rectosigmoid
(OD03868)


4.9


Kidney Cancer 9010320


11.2








CC Margin (OD03868) 


0.4 


Kidney Margin 
9010321 


24.7 


CC Mod Diff (ODO3920) 


49.0 


Normal Uterus 


3.2 


CC Margin (ODO3920) 


14.2 


Uterus Cancer 064011 


21.6 


CC Gr.2 ascend colon
(OD03921) 


33.0 


Normal Thyroid 


22.5 


CC Margin (OD03921) 


6.0 


Thyroid Cancer 064010


LZ.b 


CC from Partial 
Hepatectomy (ODO4309) 
Mets 


48.6 


Thyroid Cancer 
A302152 


30.1 


Liver Margin (ODO4309) 


19.2 


Thyroid Margin 
A302153 


39.0 


Colon mets to lung 
(OD04451-01) 


17.1 


Normal Breast 


21.9 


Lung Margin (OD04451-02) 


11.2 


Breast Cancer 
(OD04566) 


29.5 


Normal Prostate 6546-1 


100.0 


Breast Cancer 
(OD04590-01)


82.9 


Prostate Cancer (OD04410) 


36.1 


Breast Cancer Mets 
(OD04590-03) 


76.3 


Prostate Margin (OD04410) 


47.0 


Breast Cancer 
Metastasis (OD04655- 


49.7 


01) 


39.5 


Breast Cancer 064006 


21.8 


Prostate Margin (OD04720-
02)


49.7


Breast Cancer 1024 


32.1 


Normal Lung 061010 


37.1 


Breast Cancer 9100266 


52.9 


Lung Met to Muscle 
(OD04286) 


11.6 


Breast Margin 9100265


21.5 


Muscle Margin (OD04286) 


3.6 


Breast Cancer A209073 


15.4 


Lung Malignant Cancer 
(OD03126) 








Lung Margin (OD03126)


21.0 


Normal Liver 


12.8 


Lung Cancer (OD04404)


9.5 


Liver Cancer 064003 


5.5 


Lung Margin (OD04404) 


20.0 


Liver Cancer 1025 


8.7 


Lung Cancer (OD04565)


9.2 


Liver Cancer 1026 


11.4 


Lung Margin (OD04565)


25.7 


Liver Cancer 6004-T 


7.9 


Lung Cancer (OD04237-01) 


40.6 


Liver Tissue 6004-N 


15.6 


Lung Margin (OD04237-02) 


17.1 


Liver Cancer 6005-T 


17.0 


Ocular Mel Met to Liver 
(ODO4310) 


22.8 


Liver Tissue 6005-N 


7.0 


Liver Margin (ODO4310) 


12.6 


Normal Bladder 


33.0 


Melanoma Mets to Lung 
(OD04321) 


10.3


Bladder Cancer 1023 


6.6 


Lung Margin (OD04321) 


27.4 


Bladder Cancer
A302173


5.8


Normal Kidney


66.9 


Bladder Cancer 
(OD04718-01) 


39.2 


Kidney Ca, Nuclear grade 2 
(OD04338) 


82.9 


Bladder Normal 
Adjacent (OD04718-03) 


13.6 


Kidney Margin (OD04338) 


54.0 


Normal Ovary 


5.7 


Kidney Ca Nuclear grade 
1/2(OD04339) 


93.3 


Ovarian Cancer 064008 


18.4 


Kidney Margin (OD04339) 


61.6 


Ovarian Cancer 
(OD04768-07)


39.5 


Kidney Ca, Clear cell type 


20.7 


Ovary Margin 
(OD04768-08)


6.3 


Kidney Margin (OD04340)


48.6 


Normal Stomach 


12.3


Kidney Ca, Nuclear grade 3 
(OD04348) 


28.9 


Gastric Cancer 9060358 


3.6 


Kidney Margin (OD04348) 


80.1 


Stomach Margin
9060359 


12.6 


Kidney Cancer (OD04622-
01)


9.5 


Gastric Cancer 9060395 


11.4 


Kidney Margin (OD04622- 
03) 


13.0 


Stomach Margin
9060394 


14.1 


Kidney Cancer (OD04450- 
01) 


57.8 


Gastric Cancer 9060397 


22.5 


Kidney Margin (OD04450- 
03) 


38.2 


Stomach Margin 
9060396


10.5 


Kidney Cancer 8120607 


33.4 


Gastric Cancer 064005 


20.0 



Table CE. Panel 4D 



Tissue Name 


Rel. Exp.(%) Ag2827,
Run 162294650 


Tissue Name 


Rel. Exp.(%) 
Ag2827, Run 
162294650 


Secondary Th1 act


3.1 


HUVEC IL-1beta


0.7 


Secondary Th2 act 


7.8 


HUVEC IFN gamma 


3.1 


Secondary Tr1 act


7.0 


HUVEC TNF alpha + IFN 
gamma 


8.8 


Secondary Th1 rest


2.7 


HUVEC TNF alpha + IL4


5.0 


Secondary Th2 rest 


3.6 


HUVEC IL-11


2.7 


Secondary Tr1 rest


2.5 


Lung Microvascular EC none 


8.1 


Primary Th1 act


9.1 


Lung Microvascular EC 
TNFalpha + IL-1beta


3.8 


Primary Th2 act 


12.2 


Microvascular Dermal EC none 


7.5 


Primary Tr1 act


6.4 


Microvascular Dermal EC
TNFalpha + IL-1beta


8.5 


Primary Th1 rest


16.5 


Bronchial epithelium TNFalpha
+ IL1beta


10.7 


Primary Th2 rest 


6.7 


Small airway epithelium none 


4.7 


Primary Tr1 rest


7.1 


Small airway epithelium 
TNFalpha + IL-1beta


11.8 


CD45RA CD4
lymphocyte act


8.0


Coronary artery SMC rest


3.8








CD45RO CD4 
lymphocyte act 


6.9 


Coronary artery SMC
TNFalpha + IL-1beta


3.6 


CD8 lymphocyte act


7.6 


Astrocytes rest 


3.3 


Secondary CD8
lymphocyte rest 


11.7 


Astrocytes TNFalpha +
IL-1beta


5.1 


Secondary CD8
lymphocyte act 


9.8 


KU-812 (Basophil) rest 


9.0 


CD4 lymphocyte none 


4.6 


KU-812 (Basophil)
PMA/ionomycin 


16.7 


2ry Th1/Th2/Tr1 anti-
CD95 CH11


8.0 


CCD1106 (Keratinocytes) none


9.7 


LAK cells rest 


7.3 


CCD1106 (Keratinocytes) 
TNFalpha + IL-1beta


7.6 


LAK cells IL-2 


9.3 


Liver cirrhosis 


5.0 


LAK cells IL-2+IL-12 


10.7 


Lupus kidney 


2.6 


LAK cells IL-2+IFN
gamma 


17.8 


NCI-H292 none 


20.3 


LAK cells IL-2+ IL-18 


13.4 


NCI-H292 IL-4 


20.3 


LAK cells 
PMA/ionomycin 


9.2 


NCI-H292 IL-9 


31.2 


NK Cells IL-2 rest 


9.6 


NCI-H292 IL-13


22.8 


Two Way MLR 3 day 


15.0 


NCI-H292 IFN gamma 


27.7 


Two Way MLR 5 day 


7.0 


HPAEC none 


2.5 


Two Way MLR 7 day 


3.7 


HPAEC TNF alpha + IL-1 beta 


4.6 


PBMC rest 


7.4 


Lung fibroblast none 


6.6 


PBMC PWM 


27.5 


Lung fibroblast TNF alpha + 
IL-1 beta 


2.6 


PBMC PHA-L 


11.6


Lung fibroblast IL-4 


13.2 


Ramos (B cell) none 


39.2 


Lung fibroblast IL-9 


6.6 


Ramos (B cell) ionomycin 


100.0 


Lung fibroblast IL-13


3.1 


B lymphocytes PWM 


33.9 


Lung fibroblast IFN gamma


7.1 


B lymphocytes CD40L 
and IL-4


49.0 


Dermal fibroblast CCD1070
rest


4.2 


EOL-1 dbcAMP


A 

31 A 


Dermal fibroblast CCD1070
TNF alpha 


8.0 


EOL-1 dbcAMP
PMA/ionomycin


6.2 


Dermal fibroblast CCD1070 IL- 
1 beta 


2.3 


Dendritic cells none 


3.4 


Dermal fibroblast IFN gamma 


4.2 


Dendritic cells LPS 


3.1 


Dermal fibroblast IL-4 


9.5 


Dendritic cells anti-CD40 


2.8 


IBD Colitis 2 


0.8 


Monocytes rest 


11.2 


IBD Crohn's


3.2 


Monocytes LPS 


11.7 


Colon 


15.2 


Macrophages rest 


8.3 


Lung 


7.9 


Macrophages LPS 


8.4 


Thymus 


25.3 


HUVEC none 


2.3 


Kidney 


7.3 



HUVEC starved


17.3



CNS_neurodegeneration_v1.0 Summary: Ag2827/Ag3274 Expression of the NOV4 gene
is low/undetectable in all samples on this panel (CTs>35). (Data not shown.) The amp plot
indicates that the experiment with the probe and primer set Ag3274 shows a high probability of
a probe failure. 
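
The CT cut-off quoted in these summaries can be illustrated with a short sketch. It assumes, for illustration only, that relative expression is scaled to the sample with the lowest CT using the common delta-CT convention (one doubling per cycle, i.e. 100% amplification efficiency), and that CT values above 35 are treated as low/undetectable, as stated above. The function names and the sample CT values are hypothetical and are not taken from the panels.

```python
# Hypothetical sketch: scale CT values to relative expression (%) and flag
# samples above the CT > 35 cut-off mentioned in the panel summaries.
# Assumption: 100% amplification efficiency (one doubling per cycle).

CT_CUTOFF = 35.0  # CTs above this value are treated as low/undetectable


def relative_expression(ct_by_sample):
    """Return {sample: percent of the highest-expressing sample}."""
    min_ct = min(ct_by_sample.values())  # lowest CT = highest expression
    return {
        name: 100.0 * 2.0 ** (min_ct - ct)
        for name, ct in ct_by_sample.items()
    }


def is_detectable(ct, cutoff=CT_CUTOFF):
    """True when the CT is at or below the cut-off."""
    return ct <= cutoff


# Illustrative CT values only; these are not data from the panels.
example_cts = {"sample A": 31.0, "sample B": 33.0, "sample C": 36.5}
rel = relative_expression(example_cts)
detectable = {name: is_detectable(ct) for name, ct in example_cts.items()}
```

Under these assumptions, a sample two cycles behind the best sample is reported at 25% of it, and a sample with a CT of 36.5 is flagged as low/undetectable.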

Panel 1.3D Summary: Ag2827 Expression of the NOV4 gene is restricted to lymph node
and a liver cancer cell line (CTs=34). Thus, expression of the NOV4 gene could be used to
differentiate between these samples and other samples on this panel and as a marker for
lymph tissue and liver cancer. A second experiment with the probe/primer set Ag3110 shows
low/undetectable levels of expression in all samples on this panel (CTs>35). (Data not
shown.)

Panel 2D Summary: Ag2827 Highest expression of the NOV4 gene is seen in normal 
prostate (CT=31). Significant expression is also seen in normal colon and a cluster of breast 
cancer cell lines. Thus, expression of the NOV4 gene could be used to differentiate between 
these samples and other samples on this panel. 

Panel 4D Summary: Ag2827 Widespread expression of the NOV4 gene is seen in this 
panel, with highest expression in the B cell line Ramos treated with ionomycin (CT=30.8).
This transcript encodes a kinase-like molecule with potential signaling activity and thus may 
be important in maintaining normal cellular functions in a number of tissues. Therefore, 
therapies designed with the protein encoded by this transcript may be important in regulating 
cellular viability or function. 

D. NOV6 - CG56684-02: Glycodelin 

Expression of the NOV6 gene was assessed using the primer-probe sets Ag2994 and 
Ag2974, described in Tables DA and DB. Results of the RTQ-PCR runs are shown in Table 
DC. 

Table DA. Probe Name Ag2994

Primers  Sequences                                  Length  Start Position  SEQ ID NO:
Forward  5'-acaaggtcatggaggaattcat-3'               22      454             473
Probe    TET-5'-agctttctcaggaccctgcccgt-3'-TAMRA    23      477             474
Reverse  5'-tgggtaacgtccaggaagat-3'                 20      510             475


Table DB. Probe Name Ag2974

Primers  Sequences                                  Length  Start Position  SEQ ID NO:
Forward  5'-acaaggtcatggaggaattcat-3'               22      454             476
Probe    TET-5'-agctttctcaggaccctgcccgt-3'-TAMRA    23      477             477
Reverse  5'-tgggtaacgtccaggaagat-3'                 20      510             478



Table DC. Panel 4D 



Tissue Name 


Rel. 

Exp.(%) 
Ag2974, 
Run 

16452110 

A 


Rel.

Exp.(%) 
Ag2974, 
Run 

16453559 


Rel.

Exp.(%) 
Ag2994, 
Run 

16440407 


Tissue Name 


Rel. 

Exp.(%) 
Ag2974, 
Run 

16452110 

A 


Rel. 

Exp.(%) 
Ag2974, 
Run 

16453559 


Rel. 

Exp.(%) 
Ag2994, 
Run 

16440407 


Secondary Th1 act


0.0 


0.0 


0.0 


HUVEC IL-1beta


0.0 


0.0 


0.0


Secondary Th2 act 


0.0 


0.0 


0.0 


HUVEC IFN
gamma 


0.0 


0.0 


0.0


Secondary Tr1 act


0.0 


0.0 


0.0 


HUVEC TNF alpha 
+ IFN gamma 


0.0 


0.0 


0.0


Secondary Th1
rest 


0.0 


0.0 


0.0 


HUVEC TNF alpha 
+ IL4 


0.0 


0.0 


0.0


Secondary Th2 
rest 


0.9 


0.0 


0.0 


HUVEC IL-11 


0.0 


18.8 


0.0


Secondary Tr1
rest 


0.0 


0.0 


0.0 


Lung Microvascular 
EC none 


27.5 


48.0 


11.6


Primary Th1 act


0.0 


0.0 


0.0 


Lung Microvascular
EC TNFalpha + IL-
1beta


0.7 


0.0 


0.0


Primary Th2 act 


0.0 


0.0 


0.0 


Microvascular 
Dermal EC none


0.0 


13.6 


23.3


Primary Tr1 act


0.0 


0.0 


0.0 


Microvascular
Dermal EC
TNFalpha + IL-
1beta


12.9 


32.3 


10.7


Primary Th1 rest


0.0 


0.0 


0.0 


Bronchial 
epithelium 
TNFalpha + IL1beta


0.0 


0.0 


0.0


Primary Th2 rest 


0.0 


0.0 


0.0 


Small airway 
epithelium none 


0.0 


0.0 


0.0


Primary Tr1 rest


0.0 


0.0 


0.0 


Small airway
epithelium
TNFalpha + IL-
1beta


0.0 


0.0 


0.0


CD45RA CD4 


0.0 


0.0 


0.0 


Coronary artery


0.0 


0.0 


0.0


CD45RO CD4
lymphocyte act


0.0 


0.0 


0.0 


Coronary artery
SMC TNFalpha +
IL-1beta


0.0


0.0 


0 


CD8 lymphocyte
act


0.0 


0.0 


0.0 


Astrocytes rest 


0.0 


40.1 


0 


Secondary CD8 
lymphocyte rest


0.0 


0.0 


0.0 


Astrocytes 
TNFalpha + IL-
1beta


0.0 


0.0 


0 


Secondary CD8
lymphocyte act 


0.0 


0.0 


0.0 


KU-812 (Basophil) 
rest 


0.0 


0.0 


0.0


CD4 lymphocyte 

none 


0.0 


0.0 


0.0 


KU-812 (Basophil) 

PMA/ionomycin 


0.0 


0.0 


0.0


2ry
Th1/Th2/Tr1_anti-
CD95 CH11


0.0 


0.0 


0.0 


CCD1106

(Keratinocytes) 

none 


0.0 


0.0 


0.0


LAK cells rest 


0.0 


0.0 


0.0 


CCD1106
(Keratinocytes)
TNFalpha + IL-
1beta


0.0 


0.0 


0.0


LAK cells IL-2 


0.0 


0.0 


0.0 


Liver cirrhosis 


100.0 


100.0 


9 


LAK cells IL- 


0.0 


0.0 


0.0 


Lupus kidney 


0.0 


0.0 


0 


LAK cells IL- 


0.0 


0.0 


0.0 


NCI-H292 none 


0.0 


0.0 


0 


LAK cells IL-2+
IL-18


0.0 


0.0 


0.0 


NCI-H292 IL-4 


0.0 


0.0 


0 

A 


LAK cells 
PMA/ionomycin 


0.0 


0.0 


0.0 


NCI-H292 IL-9


0.0 


0.0 


0.0


NK Cells IL-2 rest


0.0 


0.0 


0.0 


NCI-H292 IL-13 


0.0 


0.0 


0.0


Two Way MLR 3
day 


0.0 


0.0 


0.0 


NCI-H292 IFN
gamma


0.0 


0.0 


0 


Two Way MLR 5
day 


0.0 


0.0 


0.0 


HPAEC none 


0.0 


0.0 


0.0


Two Way MLR 7
day 


0.0 


0.0 


0.0 


HPAEC TNF alpha
+ IL-1 beta


0.0 


26.2 


4 


PBMC rest 


0.0 


0.0 


0.0 


Lung fibroblast
none 


0.0 


0.0 


0.0


PBMC PWM 


0.0 


0.0 


0.0 


Lung fibroblast 
TNF alpha + IL-1
beta 


0.0 


0.0 


0.0


PBMC PHA-L


0.0 


0.0 


0.0 


Lung fibroblast IL-4 


0.0 


0.0 


0.0


Ramos (B cell)
none 


0.0 


19.1 


0.0 


Lung fibroblast IL-9 


0.0 


0.0 


0.0


Ramos (B cell) 
ionomycin 


6.4 


11.7 


27.2 


Lung fibroblast IL-
13


5.3


0.0 


0.0


B lymphocytes 
PWM


0.0 


0.0 


0.0 


Lung fibroblast IFN 
gamma 


0.0 


0.0 


0.0


B lymphocytes
CD40L and IL-4


0.0 


0.0 


0.0 


Dermal fibroblast 
CCD1070 rest


0.0 


0.0 


0.0


EOL-1 dbcAMP


0.0 


0.0 


0.0 


Dermal fibroblast 
CCD1070 TNF 
alpha 


0.0 


0.0 


0.0


EOL-1 dbcAMP 
PMA/ionomycin


0.0 


0.0 


0.0 


Dermal fibroblast 
CCD1070 IL-1 beta


0.0 


0.0 


0.0


Dendritic cells 
none 


0.0 


0.0 


0.0 


Dermal fibroblast
IFN gamma


0.0 


0.0 


0 


Dendritic cells 
LPS 


0.0 


0.0 


0.0 


Dermal fibroblast 
IL-4 


0.0 


0.0 


0.0


Dendritic cells 
anti-CD40 


0.0 


0.0 


0.0 


IBD Colitis 2 


0.0 


17.3 


0.0


Monocytes rest


0.0 


0.0 


0.0 


IBD Crohn's 


0.0 


0.0 


0.0


Monocytes LPS 


0.0 


0.0 


0.0 


Colon 


60.7 


96.6 


8 


Macrophages rest 


0.0 


0.0 


0.0 


Lung 


14.9 


41.8 


1 

HA A 


Macrophages LPS 


0.0 


0.0 


0.0 


Thymus 


0.0 


15.3 


12.1


HUVEC none 


0.0 


0.0 


0.0 


Kidney 


0.0 


0.0 


0.0


HUVEC starved 


0.0 


0.0 


0.0 







CNS_neurodegeneration_v1.0 Summary: Ag2994 Expression of the NOV6 gene is
low/undetectable in all samples on this panel (CTs>35). (Data not shown.) 

Panel 1.3D Summary: Ag2994 Expression of the NOV6 gene is low/undetectable in all 
samples on this panel (CTs>35). (Data not shown.) Results from a second experiment with 
the probe/primer set Ag2974 are not included. The amp plot indicates that there were 
experimental difficulties with this run. 

Panel 4D Summary: Ag2974/Ag2994 Three experiments with the same probe and primer
set all show significant expression of the NOV6 gene restricted to colon, lung, and liver
cirrhosis. Thus, expression of the NOV6 gene could be used as a marker for colon and lung 
tissue and liver cirrhosis. Furthermore, expression of this gene is decreased in colon samples 
from patients with IBD colitis and Crohn's disease relative to normal colon. Therefore, 
therapeutic modulation of the activity of the protein encoded by this gene may be useful in 
the treatment of inflammatory bowel disease. In addition, antibodies or small molecule 
therapeutics may reduce or inhibit fibrosis that occurs in liver cirrhosis. 

E. NOV7 - CG56977-01: Neuropathy target esterase/swiss cheese

Expression of the NOV7 gene was assessed using the primer-probe sets Ag3055 and
Ag3061, described in Tables EA and EB. Results of the RTQ-PCR runs are shown in Tables
EC, ED, EE and EF. 



Table EA. Probe Name Ag3055

Primers  Sequences                                     Length  Start Position  SEQ ID NO:
Forward  5'-cctcatccttttcatgttcaga-3'                  22      181             479
Probe    TET-5'-actcctcagtaccggttccggaagag-3'-TAMRA    26      133             480
Reverse  5'-gccgtaaaacatcactttgtct-3'                  22      159             481


Table EB. Probe Name Ag3061

Primers  Sequences                                     Length  Start Position  SEQ ID NO:
Forward  5'-cctcatccttttcatgttcaga-3'                  22      181             482
Probe    TET-5'-actcctcagtaccggttccggaagag-3'-TAMRA    26      133             483
Reverse  5'-gccgtaaaacatcactttgtct-3'                  22      159             484



Table EC. Panel 1.3D 



Tissue Name


Rel. Exp.(%)
Ag3055, Run
167985388


Rel. Exp.(%)
Ag3061, Run
167960032


Tissue Name


Rel. Exp.(%)
Ag3055, Run
167985388


Rel. Exp.(%)
Ag3061, Run
167960032


Liver 

adenocarcinoma 


3.8 


11.8


Kidney (fetal) 


25.3 


39.0 


Pancreas 


22.1


20.6 


Renal ca. 786-0 


2.2 


3.6 


Pancreatic ca. 
CAPAN 2 


2.1 


2.8 


Renal ca. A498 


3.1 


5.5 


Adrenal gland 


12.9 


9.7 


Renal ca. RXF 393 


2.6 


4.6 


Thyroid


11.7


11.7






1 4 


Salivary gland 


6.8 


11.2 


Renal ca. UO-31 


0.6 


0.6 


Pituitary gland 


18.0 


20.0 


Renal ca. TK-10 


1.9 


5.0 


Brain (fetal)


3U.J 


15 J 


Liver 




4.1 


Brain (whole) 


25.0 


41.5 


Liver (fetal) 


2.7 


7.6 


Brain (amygdala) 


19.3


37.9 


Liver ca. (hepatoblast) 
HepG2 


1.9 


4.6 


Brain (cerebellum) 


15.7 


24.3 


Lung 


3.6 


6.3


Brain (hippocampus) 


26.6 


34.4 


Lung (fetal) 


18.3 


22.8 


Brain (substantia 
nigra) 


13.5 


18.6 


Lung ca. (small cell) LX-1 


2.6 


8.9 


Brain (thalamus) 


17.0 


9.4 


Lung ca. (small cell) NCI- 
H69 


3.5 


4.1 


Cerebral Cortex 


36.6 


55.1 


Lung ca. (s.cell var.) SHP- 
77 


2.3 


6.3


Spinal cord 


9.7 


19.6 


Lung ca. (large cell) NCI-
H460


0.4 


0.0 


glio/astro U87-MG 


8.9 


22 


Lung ca. (non-sm. cell) 
A549 


5.1 


9.0 


glio/astro U-118-MG


1.2 


0.4 


Lung ca. (non-s.cell) NCI-
H23


2.6 


4.9 


astrocytoma 
SW1783 


3.7 


5.6 


Lung ca. (non-s.cell) HOP-
62


1.2 


1.4 


neuro*; met SK-N-
AS


4.0


7.3


Lung ca. (non-s.cl) NCI-
H522


8.6


9.1






astrocytoma SF-539 


1.1 


0.8


Lung ca. (squam.) SW 900 


1.6 


2.2 


astrocytoma SNB-75


2.8 


5.0


Lung ca. (squam.) NCI- 
H596 


2.1 


52 


glioma SNB-19


1.4 


2.8 


Mammary gland 


11.4 


10.7 


glioma U251 


4.8 


6.3 


Breast ca.* (pl.ef) MCF-7 


19.6 


23.0 


glioma SF-295


5.1 


9.4 


Breast ca.* (pl.ef) MDA-
MB-231


0.4 


2.0 


Heart (fetal) 


37.6 


43.8 


Breast ca.* (pl.ef) T47D 


100.0 


100.0 


Heart 


7.6 


15.5 


Breast ca. BT-549 


0.3


0.7 


Skeletal muscle 
(fetal) 


37.4 


44.4 


Breast ca. MDA-N 


5.0 


8.6 


Skeletal muscle 


32.8 


48.3 


Ovary 


12.3 


11.5 


Bone marrow 


1.8 


1.8 


Ovarian ca. OVCAR-3


1.6 


2.8 


Thymus 


17.6 


12.4 


Ovarian ca. OVCAR-4 


0.0 


0.0 


Spleen 


5.8 


11.0 


Ovarian ca. OVCAR-5


12.2 


8.1 


Lymph node 


20.2 


42.3 


Ovarian ca. OVCAR-8


1.9 


2.7 


Colorectal 


11.1


IQ. / 


Ovarian ca. IGROV-1


0.5 


0.0 


Stomach 


20.6 


22.8 


Ovarian ca.* (ascites) SK-
OV-3


6.6 


4.9 


Small intestine 


6.5 


6.2 


Uterus 


24.7 


36.1 


Colon ca. SW480 


9 1 






2.5 


1.1 


Colon ca.* 
SW620(SW480 met) 


6.1 


11.7 


Prostate 


1S.8 


28.3 


Colon ca. HT29 


1.7 


3.2 


Prostate ca.* (bone
met) PC-3


4.2 


2.8 


Colon ca. HCT-116


1.8 


12 


Testis 


5.4 


8.6 


Colon ca. CaCo-2 




4 7 


Melanoma Hs688(A).T


0.6 


0.4 


Colon ca. 
tissue(OD03866) 


1.1 


2.2 


Melanoma* (met) 
Hs688(B).T 


1.8 


2.3 


Colon ca. HCC-2998 


3.9 


5.7 


Melanoma UACC-62 


4.2 


3.0 


Gastric ca.* (liver 
met)NCI-N87 


4.8 


11.1


Melanoma M14 


0.5 


3.5 


Bladder 


24.0 


28.9 


Melanoma LOXIMVI


0.9 


0.9 


Trachea 


8.4 


8.4 


Melanoma* (met) SK- 
MEL-5 


4.8 


1.9 


Kidney 


12.0 


11.7 


Adipose 


22.1 


27.2 



Table ED. Panel 2.2



Tissue Name 


Rel. Exp.(%) Ag3055, 
Run 173763014 


Tissue Name 


Rel. Exp.(%) Ag3055,
Run 173763014 


Normal Colon 


18.9 


Kidney Margin 
(OD04348) 


56.6 


Colon cancer (OD06064) 


0.0 


Kidney malignant cancer 
(OD06204B) 


0.0 


Colon Margin (OD06064) 


1.7 


Kidney normal adjacent 
tissue (OD06204E) 


8.7


Colon cancer (OD06159)


0.0 


Kidney Cancer
(OD04450-01)


3.7 


Colon Margin (OD06159)


17.0 


Kidney Margin
(OD04450-03)


15.1 


Colon cancer (OD06297- 
04) 


1.1 


Kidney Cancer 8120613 


2.5 


Colon Margin (OD06297- 
05) 


13.8 


Kidney Margin 8120614 


13.7 


CC Gr.2 ascend colon 
(OD03921) 


5.0 


Kidney Cancer 9010320 


1.2 


CC Margin (OD03921) 


1.9 


Kidney Margin 9010321 


5.1 


Colon cancer metastasis 
(OD06104) 


0.0 


Kidney Cancer 8120607 


5.5 


Lung Margin (OD06104)


2.0 


Kidney Margin 8120608 


2.9 


Colon mets to lung 
(OD04451-01)


2.1 


Normal Uterus 


24.1 


Lung Margin (OD04451-
02)


11.7


Uterine Cancer 064011


19.3 




32.3 


Normal Thyroid 


5.3 


Prostate Cancer 
(OD04410)


12.2 


Thyroid Cancer 064010


12.2 


Prostate Margin
(OD04410)


29.5 


Thyroid Cancer A302152 


37.1 


Normal Ovary


10.1 


Thyroid Margin A302153 


5.0 


Ovarian cancer 
(OD06283-03) 


0.0 


Normal Breast 


35.6 


Ovarian Margin 
(OD06283-07)


2.4 


Breast Cancer (OD04566) 


8.4 


Ovarian Cancer 064008 


8.8 


Breast Cancer 1024 


33.7 


Ovarian cancer 
(OD06145) 


14.5 


Breast Cancer (OD04590-
01)


18.4 


Ovarian Margin 
(OD06145) 


16.0 


Breast Cancer Mets 
(OD04590-03) 


19.3 


Ovarian cancer 
(OD06455-03) 


4.4 


Breast Cancer Metastasis 
(OD04655-05) 


33.0 


Ovarian Margin 
(OD06455-07) 


5.4 


Breast Cancer 064006 


22.2 


Normal Lung 


15.7


Breast Cancer 9100266 


5.6 


Invasive poor diff. lung 
adeno (OD04945-01)


10.5 


Breast Margin 9100265 


7.6 


Lung Margin (OD04945- 
03) 


24.5 


Breast Cancer A209073 


2.2 


Lung Malignant Cancer 
(OD03126) 


1.5 


Breast Margin A2090734 


41.2 


Lung Margin (OD03126)


10.6 


Breast cancer (OD06083)


17.8 


Lung Cancer (OD05014A) 


3.1 


Breast cancer node 
metastasis (OD06083) 


17.4 


Lung Margin (OD05014B) 


17.9 


Normal Liver 


36.1


Lung cancer (OD06081)


1.8


Liver Cancer 1026 


2.1 


Lung Margin (OD06081) 


5.7 


Liver Cancer 1025 


37.4 


Lung Cancer (OD04237-
01)


0.0 


Liver Cancer 6004-T 


22.4 


Lung Margin (OD04237-
02) 


10.0 


Liver Tissue 6004-N 


12.0 


Ocular Melanoma 
Metastasis 


12.8 


Liver Cancer 6005-T 


6.5 


Ocular Melanoma Margin
(Liver) 


12.2 


Liver Tissue 6005-N 


42.9 


Melanoma Metastasis 


4.5 


Liver Cancer 064003 


9.3 


Melanoma Margin (Lung)


4.8 


Normal Bladder


22.5 


Normal Kidney 


18.3 


Bladder Cancer 1023


5.3 


Kidney Ca, Nuclear grade 
2 (OD04338) 


44.1 


Bladder Cancer A302173 


1.0 


Kidney Margin 
(OD04338) 


15.2 


Normal Stomach 


60.3 


Kidney Ca Nuclear grade 
1/2(OD04339) 


100.0 


Gastric Cancer 9060397 


0.0 


Kidney Margin 
(OD04339) 


21.6 


Stomach Margin 9060396


11.0 


Kidney Ca, Clear cell type 
(OD04340) 


16.3 


Gastric Cancer 9060395 


8.3 


Kidney Margin 
(OD04340)


19.6 


Stomach Margin 9060394 


14.2 


Kidney Ca, Nuclear grade 
3 (OD04348) 


1.3 


Gastric Cancer 064005 


8.1 



Table EE. Panel 4D



Tissue Name 


Rel. Exp.(%) 
Ag3055, Run 
164317260 


Rel. Exp.(%) 
Ag3061, Run
164528813 


Tissue Name 


Rel. Exp.(%) 
Ag3055, Run 
164317260 


Rel. Exp.(%)
Ag3061, Run
164528813 


Secondary Th1 act


4.7 


4.7 


HUVEC IL-1beta


1.3 


0.4 


Secondary Th2 act 


9.2 


7.7 


HUVEC IFN
gamma 


7.0 


7.3 


Secondary Tr1 act


6.3 


8.3 


HUVEC TNF alpha 
+ IFN gamma 


2.9 


5.4 


Secondary Th1 rest


37.1 


26.1 


HUVEC TNF alpha 
+ IL4 


3.2 


3.1 


Secondary Th2 rest 


27.9 


29.1 


HUVEC IL-11


4.7 


6.0 


Secondary Tr1 rest


42.3 


36.6 


Lung Microvascular 
EC none 


3.2 


8.1 


Primary Th1 act


12.1 


5.9 


Lung Microvascular 
EC TNFalpha + IL-
1beta


5.0 


7.2 


Primary Th2 act 


11.8 


6.1 


Microvascular 
Dermal EC none 


6.0 


4.9 


Primary Tr1 act


7.9 


4.9 


Microvascular
Dermal EC
TNFalpha + IL-
1beta


1.6


3.8






Primary Th1 rest


52.1 


46.7 


Bronchial 
epithelium 
TNFalpha + 
IL1beta


1.4 


6.0 


Primary Th2 rest 


64.6 


42.6 


Small airway 
epithelium none 


0.5 


0.5 


Primary Tr1 rest


32.5 


48.0 


Small airway
epithelium
TNFalpha + IL-
1beta


4.5 


3.5 


CD45RA CD4 
lymphocyte act 


3.6 


3.8 


Coronary artery
SMC rest


4.1 


4.6 


CD45RO CD4 
lymphocyte act 


o.l 


11.5


Coronary artery
SMC TNFalpha +
IL-1beta


2.2 




CD8 lymphocyte act


7.1


11.1


Astrocytes rest 


5.6 


o n 


Secondary CD8
lymphocyte rest 


5.0 


10.0 


Astrocytes 
TNFalpha + IL-
1beta


4.2 


3.8 


Secondary CD8
lymphocyte act 


9.9 


9.3 


KU-812 (Basophil)
rest


5.6 


CD4 lymphocyte 
none 


20.0 


12.5


KU-812 (Basophil) 
PMA/ionomycin


4.1 


2ry
Th1/Th2/Tr1_anti-
CD95 CH11


81.2 


100.0 


CCD1106 

(Keratinocytes) 

none 


3.8 


4.8 


LAK cells rest 


11.3


7.9 


CCD1106 
(Keratinocytes) 
TNFalpha + IL-
1beta


0.0 


0.0 


LAK cells IL-2 


13.5 


12.7 


Liver cirrhosis 


7.0 


8.5 


LAK cells IL-2+IL-
12


5.1 


3.2 


Lupus kidney 


5.1 


5.9 


LAK cells IL-
2+IFN gamma


9.0 


7.7 


NCI-H292 none


21.6 


21.5 


LAK cells IL-2+IL- 
18 


8.7 


6.3 


NCI-H292 IL-4 


15.9 


10.4 


LAK cells 
PMA/ionomycin


2.8 


3.3 


NCI-H292 IL-9 


24.0 


19.2 


NK Cells IL-2 rest 


8.8 


17.0 


NCI-H292 IL-13


11.1


16.5 


Two Way MLR 3 
day 


12.1 


10.4 


NCI-H292 IFN 
gamma 


8.5 


5.3 


Two Way MLR 5
day 


1.5 


3.8 


HPAEC none 


2.1 


2.4 


Two Way MLR 7 
day 


1.6 


6.0 


HPAEC TNFalpha 
+ IL-1 beta


1.2 


1.6 


PBMC rest 


7.3 


9.5 


Lung fibroblast
none


6.1


11.4


PBMC PWM


10.0


10.0


Lung fibroblast 
TNF alpha +IL-1 
beta 


4.9 


2.5 


PBMC PHA-L 


11.3


7.7 


Lung fibroblast IL- 
4 


7.7 


8.4 


Ramos (B cell) none 


2.4 


1.6 


Lung fibroblast IL- 
9 


5.1 


7.2


Ramos (B cell) 
ionomycin 


9.2


6.6 


Lung fibroblast IL- 
13 


1.6 


4.8 


B lymphocytes 
PWM


17.9 


13.9 


Lung fibroblast IFN 

gamma 


10.1 


9.0 


B lymphocytes 
CD40L and IL-4


75.8 


66.0 


Dermal fibroblast 
CCD1070 rest


3.2 


6.0 


EOL-1 dbcAMP 


7.2 


7.1 


Dermal fibroblast 
CCD1070 TNF 
alpha 


27.7 


34.2 


EOL-1 dbcAMP
PMA/ionomycin


70.7 


60.3 


Dermal fibroblast
CCD1070 IL-1 beta


3.5 


0.6 


Dendritic cells none 


11.7


14.1 


Dermal fibroblast 
IFN gamma 


4.6 


2.9 


Dendritic cells LPS


2.6 


3.8 


Dermal fibroblast 
IL-4 


7.7 


6.2 


Dendritic cells anti- 
CD40 


18.6 


17.6 


IBD Colitis 2 


5.1 


3.0 


Monocytes rest 


3.4 


11.3 


IBD Crohn's 


14.9 


6.3 


Monocytes LPS 


1.4


0.2 


Colon 


33.0


43.5 


Macrophages rest 


20.0 


19.6 


Lung 


16.9 


10.1 


Macrophages LPS 


1.2 


1.6 


Thymus 


85.9


73.7 


HUVEC none 


2.9 


2.6 


Kidney 


100.0


61.1 


HUVEC starved 


5.6 


5.8 







Table EF. Panel CNS 1 



Tissue Name 


Rel. Exp.(%) Ag3055, Run
171694541 


Tissue Name 


Rel. Exp.(%) Ag3055, Run 
171694541 


BA4 Control 


10.2 


BA17 PSP


1.2 


BA4 Control2 


23.8 


BA17 PSP2 


5.4 


BA4 

Alzheimer's2 


4.9 


Sub Nigra Control 


11.7 


BA4 Parkinson's 


35.8 


Sub Nigra Control2 


6.2 


BA4 Parkinson's2 


60.3 


Sub Nigra 
Alzheimer's2 


8.0 


BA4 Huntington's 


18.7 


Sub Nigra 
Parkinson's2


BA4

Huntington's2


16.2 


Sub Nigra 
Huntington's 


23.0 


BA4 PSP


3.8 


Sub Nigra 
Huntington's2 


33.4 
BA4 PSP2


13.2 


Sub Nigra PSP2


2.0


BA4 Depression 


26.6 


Sub Nigra Depression


4.6 


BA4 Depression2 


22.1 


Sub Nigra 
Depression2 


9.3 


BA7 Control 


25.9


Glob Palladus Control


0.0


BA7 Control2


11.2 


Glob Palladus 
Control2 


0.0 


BA7

Alzheimer's2


24.7 


Glob Palladus 
Alzheimer's 


2.4 


BA7 Parkinson's 


33.2 


Glob Palladus 
Alzheimer's2


9.7 


BA7 Parkinson's2


41.8 


Glob Palladus
Parkinson's


1.0 


BA7 Huntington's 


32.3 


Glob Palladus
Parkinson's2


14.3 


BA7 

Huntington's2 


100.0 


Glob Palladus PSP 


3.7 


BA7 PSP 


40.1 


Glob Palladus PSP2




BA7 PSP2 


18.7 


Glob Palladus 
Depression 


10.1 


BA7 Depression


4.6 


Temp Pole Control 


7.1 


BA9 Control 


12.7 


Temp Pole Control2 


35.4 


BA9 Control2 


42.6 


Temp Pole 
Alzheimer's


6.2 


BA9 Alzheimer's 


4.8 


Temp Pole 
Alzheimer's2


7.6 


BA9 

Alzheimer's2


27.2 


Temp Pole Parkinson's 


40.3 


BA9 Parkinson's 


39.2 


Temp Pole 

Parkinson's2


33.2 


BA9 Parkinson's2


40.3 


Temp Pole
Huntington's 


18.2 


BA9 Huntington's 


16.6 


Temp Pole PSP 


2.2 


BA9 

Huntington's2


31.0 


Temp Pole PSP2 


0.0 


BA9 PSP


0.0


Temp Pole 
Depression2 


17.1 


BA9 PSP2


2.5


Cing Gyr Control 


34.4 


BA9 Depression 


12.9 


Cing Gyr Control2 


18.8 


BA9 Depression2 


14.9 


Cing Gyr Alzheimer's


11.5 


BA17 Control 


30.4 


Cing Gyr 
Alzheimer's2 


18.7 


BA17 Control2 


14.4


Cing Gyr Parkinson's 


40.6


BA17 

Alzheimer's2 


18.3 


Cing Gyr Parkinson's2 


48.6 


BA17 Parkinson's 


52.9


Cing Gyr Huntington's 


20.4 


BA17 

Parkinson's2 


42.3


Cing Gyr 
Huntington's2 


32.8 
BA17

Huntington's 


17.0 


Cing Gyr PSP


2.2 


BA17 

Huntington's2 


35.8 


Cing Gyr PSP2


2.9 


BA17 Depression 


34.2 


Cing Gyr Depression 


17.8 


BA17 

Depression2 


55.9 


Cing Gyr Depression2 


10.2 



CNS_neurodegeneration_v1.0 Summary: Ag3055 Results from two experiments with the 
NOV7 gene are not included because the amp plot indicates that there were experimental 
difficulties with these runs. 

Panel 1.3D Summary: Ag3055/Ag3061 The NOV7 gene was run on 2 independent panels 
with excellent concordance between the panels. There is a low level of expression in most of 
the tissues in this panel, with the highest expression in a breast cancer cell line T47D 
(CTs=29). Therefore, expression of this gene may be used as a diagnostic marker for breast 
cancer. Furthermore, inhibition of this gene product using antibodies or small molecule 
inhibitors may be useful for the treatment of breast cancer. 

Among metabolic tissues, this gene has low levels of expression in pancreas, thyroid, 
pituitary, adrenal, adult and fetal heart, adult and fetal skeletal muscle, adult and fetal liver, 
and adipose. Therefore, this putative esterase may be a small molecule target for the 
treatment of metabolic and endocrine disease, including the thyroidopathies, Types 1 and 2 
diabetes, and obesity. 

In addition, this gene exhibits moderate expression throughout the brain, indicating a 
functional role in the CNS. Neuropathy target esterase is a known mediator of neuronal 
degeneration, a common feature of diseases such as Alzheimer's disease, Parkinson's disease, 
Huntington's disease, and other diseases involving neurodegeneration. Therefore, agents that 
enhance the function of this gene product may have utility as therapeutics in the treatment of 
these diseases. 

References: 

Lush ND, Li Y, Read DJ, Willis AC, Glynn P. Neuropathy target esterase and a 
homologous Drosophila neurodegeneration-associated mutant protein contain a novel domain 
conserved from bacteria to man. Biochem J 1998 May 15;332 (Pt 1):1-4 

The N-terminal amino acid sequences of proteolytic fragments of neuropathy target 
esterase (NTE), covalently labelled on its active-site serine by a biotinylated 
organophosphorus ester, were determined and used to deduce the location of this serine 
residue and to initiate cloning of its cDNA. A putative NTE clone, isolated from a human 
foetal brain cDNA library, encoded a 1327 residue polypeptide with no homology to any 
known serine esterases or proteases. The active-site serine of NTE (Ser-966) lay in the centre 
of a predicted hydrophobic helix within a 200-amino-acid C-terminal domain with marked 
similarity to conceptual proteins in bacteria, yeast and nematodes; these proteins may 
comprise a novel family of potential serine hydrolases. The Swiss Cheese protein which, 
when mutated, leads to widespread cell death in Drosophila brain [Kretzschmar, Hasan, 
Sharma, Heisenberg and Benzer (1997) J. Neurosci. 17, 7425-7432], was strikingly 
homologous to NTE, suggesting that genetically altered NTE may be involved in human 
neurodegenerative disease. 

Panel 2.2 Summary: Ag3055 Significant expression of the NOV7 gene is restricted to 
kidney cancer samples. The highest level of expression is seen in a kidney cancer sample 
(CT=32.72). In addition, there is slightly higher expression in two kidney cancers compared 
to the normal adjacent tissue. Thus, this gene could be used as a diagnostic marker for the 
presence of kidney cancer. Furthermore, antibodies or small molecule inhibitors could 
potentially be used for the treatment of kidney cancer. 
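The comparison of kidney cancer samples with their matched normal adjacent tissue can be sketched as a simple ratio calculation. The following is a minimal illustration only, assuming the relative expression percentages are read directly from the kidney rows of the Ag3055 table for this panel; the names PAIRS, tumor_to_margin_ratio and elevated_samples are hypothetical and are not part of the described method.

```python
# Relative expression (%) values transcribed from the kidney rows of the
# Ag3055 table; each entry is (cancer sample, matched margin sample).
# The dictionary and function names are illustrative assumptions.
PAIRS = {
    "OD04338": (44.1, 15.2),   # Kidney Ca, Nuclear grade 2 vs. Kidney Margin
    "OD04339": (100.0, 21.6),  # Kidney Ca, Nuclear grade 1/2 vs. Kidney Margin
    "OD04340": (16.3, 19.6),   # Kidney Ca, Clear cell type vs. Kidney Margin
}


def tumor_to_margin_ratio(cancer, margin):
    """Return the fold difference between a cancer sample and its margin."""
    if margin <= 0:
        raise ValueError("margin expression must be positive")
    return cancer / margin


def elevated_samples(pairs, threshold=1.0):
    """List sample IDs whose cancer value exceeds threshold times the margin."""
    return sorted(
        sample_id
        for sample_id, (cancer, margin) in pairs.items()
        if tumor_to_margin_ratio(cancer, margin) > threshold
    )


if __name__ == "__main__":
    for sample_id, (cancer, margin) in PAIRS.items():
        print(sample_id, round(tumor_to_margin_ratio(cancer, margin), 2))
    print("elevated:", elevated_samples(PAIRS))
```

Under these assumptions, two of the three matched pairs show higher expression in the cancer sample than in the margin, which is consistent with the observation stated in the summary. The threshold of 1.0 is an arbitrary illustrative choice and carries no statistical meaning.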

Panel 4D Summary: Ag3055/Ag3061 Two experiments produce results that are in excellent 
agreement. This gene, a neuropathy target esterase homolog, is expressed at a moderate level 
in several preparations of activated and resting T lymphocytes, activated B lymphocytes, the 
eosinophil cell line EOL-1, cytokine-activated lung and skin fibroblasts and lung 
mucoepidermoid NCI-H292 cells (CT range 29-33). This widespread expression in both cell 
lines and tissues involved in the autoimmune response suggests that small molecules that 
antagonize the NOV7 gene product may reduce or eliminate the symptoms in patients with 
autoimmune and inflammatory diseases, including Crohn's disease, ulcerative colitis, 
multiple sclerosis, chronic obstructive pulmonary disease, asthma, emphysema, rheumatoid 
arthritis, lupus erythematosus, or psoriasis. 
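To give a sense of what a CT range of 29-33 means in terms of the percentages tabulated above, the following sketch converts CT values to relative expression. It rests on two assumptions that are not stated in this section: that the sample with the lowest CT is set to 100%, and that each additional cycle corresponds to an ideal two-fold lower template amount. The function name relative_expression is hypothetical.

```python
def relative_expression(ct_values):
    """Scale CT values to percent of the highest-expressing sample.

    Assumption (not taken from the text): ideal amplification, so a
    difference of one cycle corresponds to a two-fold difference.
    """
    if not ct_values:
        raise ValueError("at least one CT value is required")
    ct_min = min(ct_values)  # lowest CT = highest expression = 100%
    return [100.0 * 2.0 ** (ct_min - ct) for ct in ct_values]


if __name__ == "__main__":
    # The two ends of the CT range quoted in the summary.
    print(relative_expression([29.0, 33.0]))
```

Under these assumptions, a sample at CT 33 would sit at roughly one sixteenth of the level of a sample at CT 29. Real amplification efficiencies differ from the ideal, so the result should be read as an order-of-magnitude guide only.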

Panel CNS_1 Summary: Ag3055 The results of this experiment confirm expression of the 
NOV7 gene in the brain. Please see Panel 1.3D for discussion of utility of this gene in the 
central nervous system. 

F. NOV8 - CG57119-01: ACID-SENSITIVE POTASSIUM CHANNEL 
PROTEIN TASK 

Expression of the NOV8 gene was assessed using the primer-probe sets Ag241 and 
Ag3074, described in Tables FA and FB. Results of the RTQ-PCR runs are shown in Tables 
FC, FD, FE, FF and FG. 

Table FA. Probe Name Ag241 

Primers   Sequences                                    Length   Start Position   SEQ ID NO: 
Forward   5'-cagggtcgaatctggaatgg-3'                   20       141              485 
Probe     TET-5'-tctggcttcagctatcagggcaccc-3'-TAMRA    25       111              486 
Reverse   5'-cccgtcatccgtttccaat-3'                    19       83               487 


Table FB. Probe Name Ag3074 

Primers   Sequences                                     Length   Start Position   SEQ ID NO: 
Forward   5'-gctccttctacttcgccatc-3'                    20       611              488 
Probe     TET-5'-tcatcactaccatcgagtacggccac-3'-TAMRA    26       581              489 
Reverse   5'-acatgcagaagaccttgcc-3'                     19       541              490 
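The primer and probe entries of Tables FA and FB can be checked for internal consistency with a short script. The sketch below is illustrative only: it confirms that each oligonucleotide sequence has the length stated in its table and provides a reverse-complement helper of the kind typically used when locating a reverse primer on a target sequence. The names PRIMER_SETS, length_mismatches and reverse_complement are hypothetical.

```python
# Oligonucleotide sequences and stated lengths transcribed from
# Tables FA (Ag241) and FB (Ag3074). Dye labels (TET, TAMRA) are omitted.
PRIMER_SETS = {
    "Ag241": [
        ("Forward", "cagggtcgaatctggaatgg", 20),
        ("Probe", "tctggcttcagctatcagggcaccc", 25),
        ("Reverse", "cccgtcatccgtttccaat", 19),
    ],
    "Ag3074": [
        ("Forward", "gctccttctacttcgccatc", 20),
        ("Probe", "tcatcactaccatcgagtacggccac", 26),
        ("Reverse", "acatgcagaagaccttgcc", 19),
    ],
}

# Translation table for complementing lower-case DNA bases.
_COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(seq):
    """Return the reverse complement of a lower-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def length_mismatches(primer_sets):
    """Return (set name, role) pairs whose sequence length differs from the table."""
    return [
        (set_name, role)
        for set_name, oligos in primer_sets.items()
        for role, seq, stated_length in oligos
        if len(seq) != stated_length
    ]


if __name__ == "__main__":
    print("length mismatches:", length_mismatches(PRIMER_SETS))
    for set_name, oligos in PRIMER_SETS.items():
        for role, seq, _ in oligos:
            print(set_name, role, reverse_complement(seq))
```

An empty mismatch list indicates that the transcribed sequences agree with the Length column; it says nothing about the Start Position values, which would require the target sequence to verify.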



Table FC. Panel 1.3D 



Tissue Name 


Rel. Exp.(%) Ag241, Run 155695586 


Rel. Exp.(%) Ag241, Run 163728044 


Rel. Exp.(%) Ag3074, Run 163724451 


Tissue Name 


Rel. Exp.(%) Ag241, Run 155695586 


Rel. Exp.(%) Ag241, Run 163728044 


Rel. Exp.(%) Ag3074, Run 163724451 


Liver 

adenocarcinoma 


0.3 


1.7 


6.9 


Kidney 
(fetal) 


1.0 


0.0 


0.4 
Pancreas 


0.5 


1.2


0.6 


Renal ca. 
786-0


0.0 


0.0 


0.0


Pancreatic ca. 
CAPAN2 


0.4 


0.0 


0.0 


Renal ca. 
A498 


0.4 


0.0 


0.0


Adrenal gland 


1.4 


1.3 


0.6 


RXF393 


0.1 


0.4 


2.1 


Thyroid 


4.1 


4.4 


19.8 


Renal ca. 

ACHN 


14.0 


23.5 


28.7 


Salivary gland 


1.0 


0.1 


1.3 


Renal ca. 
UO-31 


0.0 


0.0 


0.0 


Pituitary gland 


2.9 


2.0 


0.0


Renal ca. 
TK-10 


0.0


0.0


0.0 


Brain (fetal) 




0.0


0.0


Liver 


0.0


0.0


0.0 


Brain (whole) 


0.1 


0.3 


0.9 


Liver (fetal) 


0.0 


0.0 


0.0 


Brain (amygdala) 


0.4 


0.1 


0.0 


Liver ca. 
(hepatoblast) 


0.0 


0.0 


0.0 


Brain 

(cerebellum) 


0.0 


0.0 


0.0 


Lung 


3.8 


0.7 


1.4


Brain 

(hippocampus) 


1.3 


0.0 


0.0 


Lung (fetal) 


0.0 


0.4 


0.3 


Brain (substantia 
nigra) 


0.2 


0.0 


0.4 


Lung ca. 
(small cell) 
LX-1 


0.2 


0.0 


0.0


Brain (thalamus) 


0.2 


0.0 


1.2 


Lung ca. 
(small cell) 
NCI-H69 


2.7 


1.4 


0.3 


Cerebral Cortex 


0.1 


2.3


1.5 


Lung ca. 
(s.cell var.) 
SHP-77 


0.0 


0.0 


0.0 


Spinal cord 


0.6 


0.9 


0.7 


Lung ca. 
(large 
cell) NCI-
H460 


0.2 


0.6 


0.0 


glio/astro U87- 
MG 


0.2 


0.4 


0.6 


Lung ca. 
(non-sm. 
cell) A549


4.1 


1.2 




glio/astro U-118-
MG 


7.8 


2.9 


4.3 


Lung ca. 

(non-s.cell) 

NCI-H23 


39.0


50.0 


45.1 


astrocytoma 
SW1783 


0.0


0.0


0.0


Lung ca. 

(non-s.cell) 

HOP-62 


0.2


0.0


0.0 


neuro*; met SK- 
N-AS 


0.6 


0.0 


0.0 


Lung ca. 

(non-s.cl) 

NCI-H522 


0.2 


0.4 


0.3 


astrocytoma SF- 
539 


0.1 


0.6 


0.0 


Lung ca. 
(squam.) 
SW900 


0.8 


1.1 


0.0 
astrocytoma 
SNB-75 


2.3


0.6 


0.0 


Lung ca. 
(squam.) 
NCI-H596 


0.1 


Oi> 


1.0 


glioma SNB-19


0.0 


0.5 


0.7 


Mammary 
gland 


3.0 


1.6 


0.4 


glioma U251 


0.0 


0.3


0.0 


Breast ca.* 
(pl.ef) MCF-
7 


43.5 


73.2 


43.5


glioma SF-295 


0.2 


0.4 


0.6 


Breast ca.* 
(pl.ef) 

231 


0.0 


0.0 


0.0 


Heart (fetal) 


0.6 


0.0 


0.3


Breast ca.* 

(pl.ef) T47D


9.2 


9.2 


24.0 


Heart 


0.2 


1.5 


6.5 


Breast ca. 
BT-549 


0.5


0.0 


0.4 


Skeletal muscle 
(fetal) 


1.4 


2.6 


1.7 


Breast ca. 
MDA-N 


0.1 


0.0 


0.0 


Skeletal muscle 


0.5 


0.4 


0.0 


Ovary 


3.3


6.5 


11.0 


Bone marrow 


0.0 


0.0 


0.0 


Ovarian ca. 
OVCAR-3


11.6 


18.7 


19.1 


Thymus 


0.0 


0.8 


2.2 


Ovarian ca. 
OVCAR-4 


10.4 


5.7 


9.0 


Spleen 


0.0 


0.0 


0.7 


Ovarian ca. 

OVCAR-5


1.3


2.3


3.4 


Lymph node 


0.4 


0.0 


0.0 


Ovarian ca. 
OVCAR-8 


7.9 


9.3


12.8 


Colorectal 


0.9 


0.4 


1.0 


Ovarian ca. 
IGROV-1 


0.1 


0.0 




Stomach 


1.5 


0.7 


0.7 


Ovarian ca.* 
(ascites) SK- 
OV-3 


3.8 


2.8 


5.1 


Small intestine 


0.4 


0.3 


0.3


Uterus 


2.1 


1.3


3.9 


Colon ca.SW480 


1.3 


0.6 


0.6 


Placenta 


0.0 


0.7 


0.5 


Colon ca.* 
SW620(SW480 


0.2 


1.3


0.0 


Prostate 


0.6 


0.4 


7.2 


Colon ca. HT29 


0.0 


0.0 


0.0 


Prostate ca.* 
(bone 
met) PC-3


17.3


21.3


33.9 


Colon ca.HCT- 
116 


0.0 


0.0 


0.0 


Testis 


7.1 


3.2 


9.2 


Colon ca. CaCo- 
2 


3.0 


4.7 


2.4 


Melanoma 
Hs688(A).T 


0.0 


0.0 


0.0 


Colon ca. 
tissue(OD03866) 


1.5 


3.0 


5.4 


Melanoma* 
(met) 

Hs688(B).T 


0.0 


0.0 


1.4 
Colon ca. HCC- 
2998 


2.4 


1.6 


1.0 


Melanoma 
UACC-62 


0.0 


0.5 


0.0 


Gastric ca.* 
(liver met) NCI- 
N87 


100.0 


100.0 


100.0 


Melanoma 
M14 


0.0 


0.0 


0.4 


Bladder 


3.0 


5.8 


12.9 


Melanoma 
LOXIMVI 


0.0 


0.0 


0.0 


Trachea 


4.4 


3.1 


9.0 


Melanoma* 
(met) SK- 
MEL-5


0.0 


0.0 


0.0 


Kidney 


0.1 


0.4 


1.6 


Adipose 


2.3 


2.2 


2.6 



Table FD. Panel 2D 



Tissue Name 


Rel. Exp.(%) Ag241, Run 155695603 


Rel. Exp.(%) Ag241, Run 163578011 


Rel. Exp.(%) Ag3074, Run 163578433 


Tissue Name 


Rel. Exp.(%) Ag241, Run 155695603 


Rel. Exp.(%) Ag241, Run 163578011 


Rel. Exp.(%) Ag3074, Run 163578433 


Normal 
Colon 


3.7 


6.9 


2.1 


Kidney 
Margin 
8120608 


1.3 


1.2 


2.3 


CC Well to 
Mod Diff 
(OD03866) 


6.0 


6.3 


1.9 


Kidney 
Cancer 
8120613 


0.0 


0.0 


0.0 


CC Margin 
(OD03866) 


0.7 


0.0 


0.9 


Kidney 
Margin 
8120614 


3.2 


0.0 


1.7 


CC Gr.2 

rectosigmoid 

(OD03868) 


0.0 


0.0 


0.0 


Kidney 
Cancer 
9010320 


5.1 


3.3 


2.9 


CC Margin 
(OD03868) 


0.0 


0.0 


0.0 


Kidney 
Margin 
9010321 


6.2 


2.6 


3.6 


CC Mod Diff 
(OD03920)


0.0 


0.0 


0.0 


Normal 
Uterus 


6.3 


10.8 


8.8 


CC Margin 
(OD03920)


0.0 


0.8 


0.0 


Uterus 
Cancer 
064011 


3.5 


1.1 


3.5 


CC Gr.2 
ascend colon 
(OD03921) 


18.7 


14.1 


7.1 


Normal 
Thyroid 


18.3


10.5 


5.8 


CC Margin 
(OD03921) 


1.1 


0.6 


1.3 


Thyroid 

Cancer 

064010 


21.8 


23.0 


15.6 


CC from 
Partial 

Hepatectomy 

(OD04309)

Mets 


0.4 


3.8 


0.6 


Thyroid 

Cancer 

A302152 


15.8 


15.4 


12.1 


Liver Margin 
(OD04309) 


0.0 


0.0 


0.0 


Thyroid 
Margin 
A302153 


6.0 


8.0 


5.5 








Colon mets to 
lung 

(OD04451- 
01) 


1.0 


3.5 


0.0 


Normal 
Breast 


8.5 


12.7 


4.8 


Lung Margin 

(OD04451- 

02) 


2.4 


1.6 


0.3 


Breast 

Cancer 

(OD04566) 


71.7 


79.6 


43.2 


Normal 
Prostate 
6546-1 


3.1 


15.8 


10.8 


Breast 
Cancer 
(OD04590- 
01) 


88.3 


55.9 


100.0 


Prostate 

Cancer 

(OD04410)


OJ 


2.0 


2.1 


Breast 

Cancer 

Mets 

(OD04590- 
03) 


66.9 


59.5 


80.1 


Prostate 
Margin 
(OD04410) 


3.5 


2.2 


1.2 


Breast 

Cancer 

Metastasis 

(OD04655- 

05) 


100.0 


100.0 


82.9 



Prostate 
Cancer 
(OD04720- 
01) 


1.6 


1.4 


1.4 


Breast 
Cancer 
064006 


ko 

7.9 


2.8 


Prostate 
Margin 
(OD04720- 
02) 


5.3 


5.8 


6.5 


Breast 
Cancer 
1024 


13.1 

4.1 


13.6 


Normal Lung 
061010 


2.3 


3.1 


2.0 


Breast 

Cancer 

9100266 


90.8 


80.7 


69.7 


Lung Met to 

Muscle 

(OD04286) 


1.7 


1.2 


0.2 


Breast 

Margin 

9100265 


16.8 


18.6 


7.1 


Muscle 
Margin 
(OD04286) 


5.6 


3.5 


5.2 


Breast 
Cancer 
A209073 


4.2 


0.3 


2.0 


Lung 
Malignant 

Cancer 

(OD03126)


12.8 


8.8 


13.9 


Breast 

Margin 

A209073 


4.5 


2.0 


3.1 


Lung Margin 
(OD03126) 


2.0 


2.9 


2.9 


Normal 
Liver 


0.0 


0.0 


0.0 


Lung Cancer 
(OD04404) 


3.1 


2.0 


2.7 


Liver 

Cancer 

064003 


0.0 


0.0 


0.0 


Lung Margin 
(OD04404) 


11.0 


9.6 


4.3 


Liver 
Cancer 
1025 


0.0 


0.0 


0.8 








Lung Cancer 
(OD04565) 


0.0 


0.0 


0.0 


Liver 

Cancer 

1026 


4.3 


7.6 


3.4 


Lung Margin 
(OD04565) 


2.4 


1.7 


0.5 


Liver 

Cancer 

6004-T 


0.0 


0.0 


0.4 


Lung Cancer 

(OD04237- 

01) 


0.0 


0.0 


1.4 


Liver 

Tissue 

6004-N


0.7 


0.0 


0.0 


Lung Margin 

(OD04237- 

02) 


5.4 


5.6 


2.7 


Liver 

Cancer 

6005-T 


2.6 


7.4 


3.5 


Ocular Mel 

Met to Liver 
(OD04310)


3.1 


0.9 


0.7 


Liver 

Tissue 

6005-N 


0.5 


0.6 


0.6 


Liver Margin 
(OD04310)


0.0 


0.0 


0.8 


Normal 
Bladder 


12.9 


15.0 


8.3 


Melanoma 
Mets to Lung 
(OD04321) 


12.9 


12.9 


13.9 


Bladder 

Cancer 

1023 


0.4 


1.6 


0.2


Lung Margin 
(OD04321) 


7.5 


L 

■ 


8.8 


Bladder 

Cancer 

A302173 


0.7 


2.9 


0.0 


Normal 
Kidney 


2.3 


0.0 


1.1 


Bladder 
Cancer 
(OD04718- 
01) 


9.6 


19.5 


5.6 


Kidney Ca, 
Nuclear grade 
2(OD04338) 


2.4 


6.0 



1.5 


Bladder 

Normal 
Adjacent 
(OD04718- 
03) 


15.3 


10.2 

6J 


Kidney 
Margin 
(OD04338) 


2.3


2.9 


1.5 


Normal 
Ovary 


5.3


5.7 


9.4 


Kidney Ca 
Nuclear grade 
1/2 

(OD04339) 


2.5 


7.3 


0.8 


Ovarian 
Cancer 

064008 


70.7 


62.9 


67.8 


Kidney 
Margin 
(OD04339) 


1.6 


3.9 


1.4 


Ovarian 
Cancer 
(OD04768- 
07) 


9.9 


4.8 


5.3


Kidney Ca, 
Clear cell 

type 

(OD04340) 


0.9 


b.o 



0.2 


Ovary 
Margin 
(OD04768- 
08) 


1.2 


6.2 


4.0 


Kidney 
Margin 
(OD04340) 


4.6 


3.7 


1.7 


Normal 
Stomach 


2.1 


1.9 


0.9 
Kidney Ca, 
Nuclear grade 
3 (OD04348) 


0.0 


0.2 


1.1 


Gastric 

Cancer 
9060358 


2.5 


2.9 


0.4 




Kidney 
Margin 
(OD04348) 


2.2 


0.7 


2.4 


Stomach 

Margin 

9060359 


0.6 


2.4 


1.2 




Kidney 
Cancer 
(OD04622- 
01) 


0.0 


0.7 


0.1 


Gastric 
Cancer 

9060395 


8.4 


6.3 


2.5 




Kidney 
Margin 
(OD04622- 

03) 


5.6 


5.8 


4.7 


Stomach 

Margin 

9060394 


4.0 


2.5 


0.9 




Kidney 
Cancer 
(OD04450- 
01) 


27.0 


16.3 


9.0 


Gastric 
Cancer 
9060397 


0.4 


1.9 


0.7 




Kidney 

Margin 

(OD04450- 

03) 


0.0 


1.0 


1.4 


Stomach 

Margin 

9060396 


1.5 


1.2 


1.0 




Kidney 
Cancer 
8120607 


0.6 


1.7 


2.2 


Gastric 
Cancer 
064005 


4.5 


3.0 


2.6 



Table FE. Panel 3D 



Tissue Name 


Rel. Exp.(%) 

Ag241, Run 

165022800 


Tissue Name 


Rel. Exp.(%) 
Ag241, Run 
165022800 


Daoy- Medulloblastoma 


0.8 


Ca Ski- Cervical epidermoid 
carcinoma (metastasis) 


0.0 


TE671- Medulloblastoma 


2.1 


ES-2- Ovarian clear cell 
carcinoma 


0.0 


D283 Med- 
Medulloblastoma 


0.0 


Ramos- Stimulated with 
PMA/ionomycin 6h 


0.0 


PFSK-1- Primitive 
Neuroectodermal 


0.0 


Ramos- Stimulated with 
PMA/ionomycin 14h 


0.0 


XF-498- CNS 


0.0 


MEG-01- Chronic myelogenous 
leukemia (megakaryoblast) 


0.0 


SNB-78- Glioma 


0.0 


Raji- Burkitt's lymphoma 


0.0 


SF-268- Glioblastoma 


0.0 


Daudi- Burkitt's lymphoma 


0.0 


T98G- Glioblastoma 


0.0 


U266- B-cell plasmacytoma 


6.8 


SK-N-SH- Neuroblastoma 
(metastasis) 


1.8 


CA46- Burkitt's lymphoma 


0.0 


SF-295- Glioblastoma 


0.0 


RL- non-Hodgkin's B-cell 
lymphoma 


0.0 


Cerebellum 


1.1 


JM1- pre-B-cell lymphoma 


0.0 


Cerebellum 


0.0 


Jurkat- T cell leukemia 


0.0 
NCI-H292- 
Mucoepidermoid lung 
carcinoma 


0.0


TF-1- Erythroleukemia 


0.0


DMS-114- Small cell lung 
cancer 


0.8 


HUT 78- T-cell lymphoma 


0.0 


DMS-79- Small cell lung 
cancer 


9.5 


U937- Histiocytic lymphoma 


0.0 


NCI-H146- Small cell lung 
cancer 


0.7 


KU-812- Myelogenous leukemia 


0.0 


NCI-H520- Small cell lung 

cancer 


4.0 


769-P- Clear cell renal 
carcinoma 


0.0 


NCI-N417- Small cell lung 
cancer 


0.8 


Caki-2- Clear cell renal 
carcinoma 


0.0 


NCI-H82- Small cell lung 

cancer 


0.0 


SW 839- Clear cell renal 
carcinoma 


0.0 


NCI-H157- Squamous cell 
lung cancer (metastasis) 


0.0 


G401- Wilms' tumor 


28.1 


NCI-H1155- Large cell 
lung cancer 


0.0 


Hs766T- Pancreatic carcinoma 
(LN metastasis) 


0.4 


NCI-H1299- Large cell 
lung cancer 


0.0 


CAPAN-1- Pancreatic 
adenocarcinoma (liver 
metastasis) 


2.0


NCI-H727- Lung carcinoid 


1.0 


SU86.86- Pancreatic carcinoma 
(liver metastasis) 


0.7 


NCI-UMC-11- Lung 
carcinoid 


0.0 


BxPC-3- Pancreatic 
adenocarcinoma 


1.7 


LX-1- Small cell lung 
cancer 


0.0 


HPAC- Pancreatic 

adenocarcinoma 


1.4 


Colo-205- Colon cancer 


0.0 


MIA PaCa-2- Pancreatic 
carcinoma 


4.6 


KM12- Colon cancer 


0.0 


CFPAC-1- Pancreatic ductal 
adenocarcinoma 


0.0 


KM20L2- Colon cancer 


1.8 


PANC-1- Pancreatic epithelioid 

ductal carcinoma 


15.7 


NCI-H716- Colon cancer 


0.0 


T24- Bladder carcinoma 
(transitional cell) 


0.0 


SW-48- Colon 
adenocarcinoma 


0.0 


5637- Bladder carcinoma 


2.3 


SW1116- Colon 
adenocarcinoma 


0.0 


HT-1197- Bladder carcinoma 


0.0 


LS 174T- Colon 
adenocarcinoma 


14.4 


UM-UC-3- Bladder carcinoma 
(transitional cell) 


0.0 


SW-948- Colon 
adenocarcinoma 


0.0 


A204- Rhabdomyosarcoma 


0.0 


SW-480- Colon 
adenocarcinoma 


0.6 


HT-1080- Fibrosarcoma 


0.0 


NCI-SNU-5- Gastric 
carcinoma 


1.4 


MG-63- Osteosarcoma 


0.0 


KATO III- Gastric 
carcinoma 


0.0 


SK-LMS-1- Leiomyosarcoma 
(vulva) 


7.6 




NCI-SNU-16- Gastric 

carcinoma 


0.0 


SJRH30- Rhabdomyosarcoma 
(met to bone marrow) 


0.0 


NCI-SNU-1- Gastric 
carcinoma 


0.0 


A431- Epidermoid carcinoma 


0.0 


RF-1- Gastric 
adenocarcinoma 


0.0 


WM266-4- Melanoma 


19.1 


RF-48- Gastric 
adenocarcinoma 


0.0 


DU 145- Prostate carcinoma 

(brain metastasis) 


0.0 


MKN-45- Gastric 
carcinoma 


18.2 


MDA-MB-468- Breast 
adenocarcinoma 


0.5 


NCI-N87- Gastric 
carcinoma 


12.2 


SCC-4- Squamous cell 

carcinoma of tongue 


0.0 


OVCAR-5- Ovarian 
carcinoma 


0.0 


SCC-9- Squamous cell 
carcinoma of tongue 


0.0 


RL95-2- Uterine carcinoma 


0.0 


SCC-15- Squamous cell 
carcinoma of tongue 


0.0 


HelaS3- Cervical 
adenocarcinoma 


100.0 


CAL 27- Squamous cell 
carcinoma of tongue 


2.0 



Table FF. Panel 4.1D 



Tissue Name 


Rel. Exp.(%) Ag3074, 
Run 248389309 


Tissue Name 


Rel. Exp.(%) Ag3074, 
Run 248389309 


Secondary Th1 act 


0.0 


HUVEC IL-1beta 


0.0 


Secondary Th2 act 


1.6 


HUVEC IFN gamma 


0.0 


Secondary Tr1 act 


3.3 


HUVEC TNF alpha + IFN 

gamma 


0.0 


Secondary Th1 rest 


0.0 


HUVEC TNF alpha + IL4 


0.0 


Secondary Th2 rest 


0.0 


HUVEC IL-11 


0.0 


Secondary Tr1 rest 


0.0 


Lung Microvascular EC none 


0.0 


Primary Th1 act 


5.2 


Lung Microvascular EC 
TNFalpha + IL-1beta 


0.0 


Primary Th2 act 


0.0 


Microvascular Dermal EC 

none 


0.0 


Primary Tr1 act 


0.0 


Microvascular Dermal EC 
TNFalpha + IL-1beta 


0.0 


Primary Th1 rest 


0.0 


Bronchial epithelium 
TNFalpha + IL1beta 


0.0 


Primary Th2 rest 


0.0 


Small airway epithelium 
none 


0.0 


Primary Tr1 rest 


0.0 


Small airway epithelium 
TNFalpha + IL-1beta 


0.0 


CD45RA CD4 
lymphocyte act 


5.6 


Coronary artery SMC rest 


0.0 


CD45RO CD4 
lymphocyte act 


4.0 


Coronary artery SMC 
TNFalpha + IL-1beta 


0.0 


CD8 lymphocyte act 


0.0 


Astrocytes rest 


0.0 


Secondary CD8 
lymphocyte rest 


0.0 


Astrocytes TNFalpha + IL- 
1beta 


5.8 


Secondary CD8 
lymphocyte act 


0.0 


KU-812 (Basophil) rest 


0.0 


CD4 lymphocyte none 


0.0 


KU-812 (Basophil) 
PMA/ionomycin 


0.0 


2ry Th1/Th2/Tr1 anti- 
CD95 CH11 


0.0 


CCD1106 (Keratinocytes) 
none 


0.0 


LAK cells rest 


0.0 


CCD1106 (Keratinocytes) 
TNFalpha + IL-1beta 


0.0 


LAK cells IL-2 


0.0 


Liver cirrhosis 


0.0 


LAK cells IL-2+IL-12 


0.0 


NCI-H292 none 


13.7 


LAK cells IL-2+IFN 

gamma 


0.0 


NCI-H292 IL-4 


0.0 


LAK cells IL-2+IL-18 


0.0 


NCI-H292 IL-9 


17.4 


LAK cells 
PMA/ionomycin 


0.0 


NCI-H292 IL-13 


20.7 


NK Cells IL-2 rest 


0.0 


NCI-H292 IFN gamma 


13.0 


Two Way MLR 3 day 


0.0 


HPAEC none 


0.0 


Two Way MLR 5 day 


0.0 


HPAEC TNF alpha + IL-1 

beta 


0.0


Two Way MLR 7 day 


0.0 


Lung fibroblast none 


0.0 


PBMC rest 


0.0 


Lung fibroblast TNF alpha + 
IL-1 beta 


0.0 


PBMC PWM 


0.0


Lung fibroblast IL-4 


0.0 


PBMC PHA-L 


3.4 


Lung fibroblast IL-9 


0.0 


Ramos (B cell) none 


0.0 


Lung fibroblast IL-13 


0.0 


Ramos (B cell) 
ionomycin 


0.0 


Lung fibroblast IFN gamma 


0.0 


B lymphocytes PWM 


0.0 


Dermal fibroblast CCD1070 


4.2 


B lymphocytes CD40L 
and IL-4 


1.0 


Dermal fibroblast CCD1070 
TNF alpha 


0.0 


EOL-1 dbcAMP 


0.0 


Dermal fibroblast CCD1070 

IL-1 beta 


1.0 


EOL-1 dbcAMP 
PMA/ionomycin 


0.0 


Dermal fibroblast IFN 
gamma 


100.0


Dendritic cells none 


0.0 


Dermal fibroblast IL-4 


78.5 


Dendritic cells LPS 


0.0 


Dermal Fibroblasts rest 


59.9 


Dendritic cells anti- 
CD40 


0.0 


Neutrophils TNFa+LPS 


0.0 


Monocytes rest 


0.0 


Neutrophils rest 


0.0 


Monocytes LPS 


0.0 


Colon 


0.0 


Macrophages rest 


0.0 


Lung 


0.0 


Macrophages LPS 


0.0 


Thymus 


0.0 


HUVEC none 


0.0 


Kidney 


0.0 


HUVEC starved 


0.0 







Table FG. Panel 4D 




Tissue Name 


Rel. Exp.(%) 
Ag241, Run 
165010380 


Rel. Exp.(%) 
Ag3074, Run 
162598884 


Tissue Name 


Rel. Exp.(%) 
Ag241, Run 
165010380 


Rel. Exp.(%) 
Ag3074, Run 
162598884 


Secondary Th1 act 


5.6 


0.0 


HUVEC IL-1beta 


0.0 


0.0 


Secondary Th2 act 


1.7 


0.0 


HUVEC IFN 
gamma 


0.0 


0.0 


Secondary Tr1 act 


0.0 


0.0 


HUVEC TNF alpha 
+ IFN gamma 


0.0 


0.0 


Secondary Th1 rest 


0.0 


0.0 


HUVEC TNF alpha 
+ IL4 


0.0 


0.0 


Secondary Th2 rest 


0.0 


0.0 


HUVEC IL-11 


0.0 


0.0 


Secondary Tr1 rest 


0.0 


0.0 


Lung Microvascular 

EC none 


0.0 


0.0 


Primary Th1 act 


40.9 


33.7 


Lung Microvascular 
EC TNFalpha + IL- 
1beta 


0.0 


0.0 


Primary Th2 act 


2.0 


0.0 


Microvascular 
Dermal EC none 


0.0 


0.0 


Primary Tr1 act 


6.7 


0.0 


Microvascular 
Dermal EC 
TNFalpha + IL- 
1beta 


0.0 


0.0 


Primary Th1 rest 


1.5 


0.0 


Bronchial 
epithelium 
TNFalpha + 
IL1beta 


1.4 


0.0 


Primary Th2 rest 


0.0 


0.0 


Small airway 
epithelium none 


0.0 


0.0 


Primary Tr1 rest 


0.0 


0.0 


Small airway 
epithelium 
TNFalpha + IL- 
1beta 


0.0 


0.0 


CD45RA CD4 
lymphocyte act 


2.7 


0.0 


Coronery artery 
SMC rest


6.9 


0.0 


CD45RO CD4 
lymphocyte act 


O./ 


0.0


Coronery artery
SMC TNFalpha +
IL-1beta


0.0


0.0


CD8 lymphocyte act


0.0


2.0


Astrocytes rest


0.0


0.0


Secondary CD8
lymphocyte rest 


1.6 


11.2 



Astrocytes 
TNFalpha + IL-
1beta


1.3 


12.9 


Secondary CD8
lymphocyte act


0.0 


0.0 



KU-812 (Basophil)

rest 


0.0 


0.0 


CD4 lymphocyte 
none 


0.0 


0.0


KU-812 (Basophil) 
PMA/ionomycin


0.0 


0.0 


2ry 

Thl/Th2/Trl anti- 
CD95CH11 


0.0 



0.0 


CCD1106 

(Keratinocytes) 

none 


0.0 


0.0 


LAK cells rest 


0.0 


0.0 


CCD1106
(Keratinocytes)
TNFalpha + IL-1beta


0.0


0.0






LAK cells IL-2


1.7 


0.0 


Liver cirrhosis 


10.2 


2.7 


LAK cells IL-2+IL-12


1.7 


0.0 


Lupus kidney 


0.2 


3.1 


LAK cells IL- 
2+IFN gamma 


0.0 


0.0 


NCI-H292 none


19.6 


9.9 


LAK cells IL-2+ IL- 
18 


0.0 


1.6 


NCI-H292 IL-4 


25.3 


3.5 


LAK cells 

PMA/ionomycin


0.2


0.0 


NCI-H292 IL-9


74.7 


31.6 


NK Cells IL-2 rest 


0.0 


0.0 


NCI-H292 IL-13 


21.0 


12.8 


Two Way MLR 3 
day 


0.0 


0.0 


NCI-H292 IFN gamma


23.2


29.1 


Two Way MLR 5 

day 


0.0 


0.0 


HPAEC none 


0.0 


0.0 


Two Way MLR 7 
day 


0.0 


0.0 


HPAEC TNF alpha 
+ IL- 1 beta 


0.0 


0.0 


PBMC rest 


0.0 


0.0 


Lung fibroblast 
none 


0.0 


0.0 


PBMC PWM


14.5 


2.2 


Lung fibroblast 
TNF alpha +IL-1 
beta 


0.0 


0.0 


PBMC PHA-L


14.1


2.0 


Lung fibroblast IL-
4


0.0 


0.0 


Ramos (B cell) none


0.0


0.0 


Lung fibroblast IL- 
9 


0.0 


0.0 


Ramos (B cell) 
ionomycin 


0.0


0.0 


Lung fibroblast IL- 
13 


0.0 


2.6 


B lymphocytes 
PWM 


27.2 


4.2 


Lung fibroblast IFN 
gamma 


0.0 


0.0 


B lymphocytes 
CD40L and IL-4


3.1


0.0 


Dermal fibroblast 
CCD1070 rest


5.5 


3.2 


EOL-1 dbcAMP 


0.0 




Dermal fibroblast
CCD1070 TNF alpha


1.0


0.0


EOL-1 dbcAMP 
PMA/ionomycin


0.0 


0.0 


Dermal fibroblast
CCD1070 IL-1 beta


2.6 


0.0 


Dendritic cells none 


0.0 


0.0 


Dermal fibroblast 
IFN gamma 


100.0 


100.0 


Dendritic cells LPS 


0.0 




Dermal fibroblast 
IL-4 


87.7 


61.1


Dendritic cells anti- 
CD40 


0.0 


0.0 


IBD Colitis 2 


0.0 


0.0 


Monocytes rest 


0.0 


0.0 


IBD Crohn's 


0.0 


0.0 


Monocytes LPS 


1.6 


0.0 


Colon 


52.5 


3.0 


Macrophages rest 


0.0 


0.0 


Lung 


43.5 


25.3 


Macrophages LPS 


0.0 


0.0 


Thymus 


9.7 


3.7


HUVEC none


0.0


0.0


Kidney


4.7


4.9


HUVEC starved


0.0


0.0



Panel 1 Summary: Ag241 Expression of the NOV8 gene is low/undetectable in all samples
on this panel (CTs>35). (Data not shown.) The amp plot indicates that there is a high
probability of a probe failure.

Panel 1.3D Summary: Ag241/Ag3074 Three experiments with two different probe and
primer sets produce results that are in very good agreement. Expression of the NOV8 gene in
this panel is most prominent in cancer cell lines, with highest expression in a gastric cancer
cell line (CT=28). Significant levels of expression are also seen in cell lines derived from
prostate cancer, ovarian cancer, breast cancer, lung cancer, and renal cancer. Thus, the
therapeutic inhibition of this gene activity, through the use of small molecule drugs or
antibodies, might be of utility in the treatment of the above listed cancer types. In addition,
expression of this gene could be used as a diagnostic marker for cancer.

Among metabolic tissues, the NOV8 gene has a low level of expression in adrenal, 
pituitary, heart and adipose. Thus, this gene product may be a small molecule target for the 
treatment of metabolic and endocrine disease, including the adrenalopathies, obesity and 
Type 2 diabetes. 

Results from one experiment with the Ag241 show low/undetectable levels of 
expression in all the samples on this panel (CTs>35). (Data not shown.) 

References: 

Maingret F, Patel AJ, Lesage F, Lazdunski M, Honore E. Lysophospholipids open the
two-pore domain mechano-gated K(+) channels TREK-1 and TRAAK. J Biol Chem. 2000
Apr 7;275(14):10128-33.

The two-pore (2P) domain K(+) channels TREK-1 and TRAAK are opened by
membrane stretch as well as arachidonic acid (AA) (Patel, A. J., Honore, E., Maingret, F.,
Lesage, F., Fink, M., Duprat, F., and Lazdunski, M. (1998) EMBO J. 17, 4283-4290;
Maingret, F., Patel, A. J., Lesage, F., Lazdunski, M., and Honore, E. (1999) J. Biol. Chem.
274, 26691-26696; Maingret, F., Fosset, M., Lesage, F., Lazdunski, M., and Honore, E.
(1999) J. Biol. Chem. 274, 1381-1387). We demonstrate that lysophospholipids (LPs) and
platelet-activating factor also produce large specific and reversible activations of TREK-1
and TRAAK. LPs activation is a function of the size of the polar head and length of the acyl
chain but is independent of the charge of the molecule. Bath application of
lysophosphatidylcholine (LPC) immediately opens TREK-1 and TRAAK in the cell-attached
patch configuration. In excised patches, LPC activation is lost, whereas AA still produces
maximal opening. The carboxyl-terminal region of TREK-1, but not the amino terminus and
the extracellular loop M1P1, is critically required for LPC activation. LPC activation is
indirect and may possibly involve a cytosolic factor, whereas AA directly interacts with
either the channel proteins or the bilayer and mimics stretch. Opening of TREK-1 and
TRAAK by fatty acids and LPs may be an important switch in the regulation of synaptic
function and may also play a protective role during ischemia and inflammation.

PMID: 10744694

Panel 2D Summary: Ag241/Ag3041 The expression of the NOV8 gene was assessed in 
three independent runs with good concordance between the runs. This gene is expressed at a 
higher level in colon, thyroid, breast and bladder cancer samples compared to normal 
adjacent tissues. Hence this gene can be used as a diagnostic marker for these cancers and 
inhibition of the gene product using antibodies or small molecule drugs can be used for the 
treatment of these cancers. 

Panel 3D Summary: Ag241 The expression of the NOV8 gene was assessed in one run.
This gene is expressed in several cell lines, including melanoma, gastric cancer, kidney
cancer, cervical cancer and lung cancer cell lines. Thus, the therapeutic inhibition of this gene
activity, through the use of small molecule drugs or antibodies, might be of utility in the
treatment of the above listed cancer types.

Panels 4D/4.1D Summary: Ag241/Ag3074 Two experiments with two different probe and
primer sets show highest expression of the NOV8 gene in dermal fibroblasts treated with
IFN-gamma (CTs=30-33). Significant expression is also seen in dermal fibroblasts treated
with IL-4. This expression suggests that the protein encoded by this gene may be involved in
skin disorders, such as psoriasis. Significant levels of expression are also seen in both treated
and untreated samples derived from the mucoepidermoid pulmonary cell line NCI-H292.
This expression profile suggests that the gene product may also be involved in inflammatory
processes that affect the lung. Therefore, therapeutic modulation of the expression or function
of the protein encoded by this gene may be effective in the treatment of asthma, allergies,
emphysema and COPD.

G. NOV10 - CG56860-01: Prostaglandin Omega-Hydroxylase Like Gene

Expression of the NOV10 gene was assessed using the primer-probe set Ag3038,
described in Table GA. Results of the RTQ-PCR runs are shown in Table GB.



Table GA. Probe Name Ag3038

Primers   Sequences                                    Length   Start Position   SEQ ID NO:
Forward   5'-acagactcccagatggtgtct-3'                  21       36               491
Probe     TET-5'-ctcctccaaggagcctcactgctgag-3'-TAMRA   26       62               492
Reverse   5'-ggctgccttcaatagtaacaga-3'                 22       94               493
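
The lengths and start positions listed for primer-probe set Ag3038 can be cross-checked with a few lines of Python. This is only an illustrative sketch. It assumes that start positions are 1-based coordinates on the target sequence and that the reverse primer is positioned by its leftmost base on that strand; neither convention is stated in this section. The names `PRIMER_SET`, `length_mismatches` and `amplicon_span` are hypothetical.

```python
# Primer-probe set Ag3038 as listed above: (sequence, listed length, listed start).
PRIMER_SET = {
    "forward": ("acagactcccagatggtgtct", 21, 36),
    "probe": ("ctcctccaaggagcctcactgctgag", 26, 62),
    "reverse": ("ggctgccttcaatagtaacaga", 22, 94),
}


def length_mismatches(primer_set):
    """Return names of oligos whose sequence length differs from the listed length."""
    return [
        name
        for name, (seq, length, _start) in primer_set.items()
        if len(seq) != length
    ]


def amplicon_span(primer_set):
    """Return (first, last, size) of the region covered by the whole set.

    Assumption: starts are 1-based and every oligo, including the reverse
    primer, is positioned by its leftmost base on the target strand.
    """
    first = min(start for _seq, _length, start in primer_set.values())
    last = max(start + length - 1 for _seq, length, start in primer_set.values())
    return first, last, last - first + 1


if __name__ == "__main__":
    print(length_mismatches(PRIMER_SET))
    print(amplicon_span(PRIMER_SET))
```

Under these assumptions the three oligos do not overlap and the set spans a short region of the target, as would be expected for a TaqMan-style assay.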



Table GB. Panel 4D

Rel. Exp.(%) Ag3038, Run 164528701

Tissue Name                      Rel. Exp.(%)   Tissue Name                                   Rel. Exp.(%)
Secondary Th1 act                0.0            HUVEC IL-1beta                                0.0
Secondary Th2 act                0.0            HUVEC IFN gamma                               0.0
Secondary Tr1 act                0.0            HUVEC TNF alpha + IFN gamma                   0.0
Secondary Th1 rest               0.0            HUVEC TNF alpha + IL4                         0.0
Secondary Th2 rest               0.0            HUVEC IL-11                                   0.0
Secondary Tr1 rest               0.0            Lung Microvascular EC none                    0.0
Primary Th1 act                  0.0            Lung Microvascular EC TNFalpha + IL-1beta     0.0
Primary Th2 act                  0.0            Microvascular Dermal EC none                  0.0
Primary Tr1 act                  0.0            Microvascular Dermal EC TNFalpha + IL-1beta   0.0
Primary Th1 rest                 0.0            Bronchial epithelium TNFalpha + IL1beta       0.0
Primary Th2 rest                 0.0            Small airway epithelium none                  0.0
Primary Tr1 rest                 0.0            Small airway epithelium TNFalpha + IL-1beta   0.0
CD45RA CD4 lymphocyte act        0.0            Coronary artery SMC rest                      0.0
CD45RO CD4 lymphocyte act        0.0            Coronary artery SMC TNFalpha + IL-1beta       0.0
CD8 lymphocyte act               0.0            Astrocytes rest                               0.0
Secondary CD8 lymphocyte rest    0.0            Astrocytes TNFalpha + IL-1beta                0.0
Secondary CD8 lymphocyte act     0.0            KU-812 (Basophil) rest                        0.0
CD4 lymphocyte none              0.0            KU-812 (Basophil) PMA/ionomycin               0.0
2ry Th1/Th2/Tr1 anti-CD95 CH11   0.0            CCD1106 (Keratinocytes) none                  0.0
LAK cells rest                   0.0            CCD1106 (Keratinocytes) TNFalpha + IL-1beta   0.0
LAK cells IL-2                   0.0            Liver cirrhosis                               100.0
LAK cells IL-2+IL-12             0.0            Lupus kidney                                  0.0
LAK cells IL-2+IFN gamma         0.0            NCI-H292 none                                 0.0
LAK cells IL-2+IL-18             0.0            NCI-H292 IL-4                                 0.0
LAK cells PMA/ionomycin          0.0            NCI-H292 IL-9                                 0.0
NK Cells IL-2 rest               0.0            NCI-H292 IL-13                                0.0
Two Way MLR 3 day                0.0            NCI-H292 IFN gamma                            0.0
Two Way MLR 5 day                0.0            HPAEC none                                    0.0
Two Way MLR 7 day                0.0            HPAEC TNF alpha + IL-1 beta                   0.0
PBMC rest                        0.0            Lung fibroblast none                          0.0
PBMC PWM                         0.0            Lung fibroblast TNF alpha + IL-1 beta         0.0
PBMC PHA-L                       0.0            Lung fibroblast IL-4                          0.0
Ramos (B cell) none              0.0            Lung fibroblast IL-9                          0.0
Ramos (B cell) ionomycin         0.0            Lung fibroblast IL-13                         0.0
B lymphocytes PWM                0.0            Lung fibroblast IFN gamma                     0.0
B lymphocytes CD40L and IL-4     0.0            Dermal fibroblast CCD1070 rest                0.0
EOL-1 dbcAMP                     0.0            Dermal fibroblast CCD1070 TNF alpha           0.0
EOL-1 dbcAMP PMA/ionomycin       0.0            Dermal fibroblast CCD1070 IL-1 beta           0.0
Dendritic cells none             0.0            Dermal fibroblast IFN gamma                   0.0
Dendritic cells LPS              0.0            Dermal fibroblast IL-4                        4.2
Dendritic cells anti-CD40        0.0            IBD Colitis 2                                 0.0
Monocytes rest                   0.0            IBD Crohn's                                   0.0
Monocytes LPS                    0.0            Colon                                         1.7
Macrophages rest                 0.0            Lung                                          9.5
Macrophages LPS                  0.0            Thymus                                        16.2
HUVEC none                       0.0            Kidney                                        6.8
HUVEC starved                    0.0







CNS_neurodegeneration_v1.0 Summary: Ag3038 Expression of the NOV10 gene is
low/undetectable in all samples on this panel (CTs>35). (Data not shown.)

Panel 1.3D Summary: Ag3038 Expression of the NOV10 gene is low/undetectable in all
samples on this panel (CTs>35). (Data not shown.)
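
The summaries in this section describe samples with CT values above 35 as showing low/undetectable expression. A minimal sketch of that cutoff follows. Only the threshold of 35 is taken from the text; the function names `is_detectable` and `detectable_samples`, and the idea of holding CT values in a dictionary keyed by sample name, are hypothetical illustrations.

```python
CT_CUTOFF = 35.0  # samples with CT > 35 are described as low/undetectable


def is_detectable(ct, cutoff=CT_CUTOFF):
    """Return True when a CT value is at or below the cutoff."""
    return ct <= cutoff


def detectable_samples(ct_by_sample, cutoff=CT_CUTOFF):
    """Return sample names that pass the cutoff, ordered from lowest CT upward."""
    hits = [(ct, name) for name, ct in ct_by_sample.items() if is_detectable(ct, cutoff)]
    return [name for _ct, name in sorted(hits)]
```

A lower CT means the amplification signal crossed the detection threshold earlier, so the ordering returned by `detectable_samples` runs from the strongest to the weakest detectable sample.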

Panel 4D Summary: Ag3038 Significant expression of the NOV10 gene is restricted to a
liver cirrhosis sample (CT=34). Therefore, antibodies or small molecule therapeutics
designed with this gene product may reduce or inhibit fibrosis that occurs in liver cirrhosis. In
addition, expression of this gene could also be used for the diagnosis of liver cirrhosis.

H. NOV11 - CG57024-01: MYELOID UPREGULATED PROTEIN

Expression of the NOV11 gene was assessed using the primer-probe set Ag3064,
described in Table HA. Results of the RTQ-PCR runs are shown in Table HB.

Table HA. Probe Name Ag3064

Primers   Sequences                                 Length   Start Position   SEQ ID NO:
Forward   5'-caagtacggtgagcccaaa-3'                 19       920              494
Probe     TET-5'-ctgtccctgggacaccagctggt-3'-TAMRA   23       965              495
Reverse   5'-caggttgacgtaggtgaagatg-3'              22       994              496



Table HB . Panel 4D 



Tissue Name 


Rel. Exp.(%) Ag3064, 
Run 164317426 


Tissue Name 


Rel. Exp.(%) Ag3064,
Run 164317426 


Secondary Thl act 


0.0 


HUVEC IL-1beta


0.0 


Secondary Th2 act 


0.0 


HUVEC IFN gamma 


3.3 


Secondary Trl act 


0.0 


HUVEC TNF alpha + IFN 

gamma 


0.0 


Secondary Thl rest 


0.0


HUVEC TNF alpha + IL4 


0.0 


Secondary Th2 rest 


0.0 


HUVEC IL-11 


0.0 


Secondary Trl rest 


0.0 


Lung Microvascular EC none 


0.0 


Primary Thl act 


4.5 


Lung Microvascular EC 
TNFalpha + IL-lbeta 


6.6 


Primary Th2 act 


0.0 


Microvascular Dermal EC 
none 


0.0 


Primary Trl act 


0.0 


Microsvasular Dermal EC 
TNFalpha + IL-lbeta 


0.0 


Primary Thl rest 


0.0 


Bronchial epithelium 
TNFalpha + ILlbeta 


0.0 


Primary Th2 rest 


0.0 


Small airway epithelium 

none 


0.0 


Primary Trl rest 


0.0 


Small airway epithelium 
TNFalpha + lL-lbeta 


0.0 


CD45RA CD4 
lymphocyte act 


0.0 


Coronery artery SMC rest 


0.0


CD45RO CD4
lymphocyte act 


0.0 


Coronery artery SMC 
TNFalpha + IL-lbeta 


0.0 


CD8 lymphocyte act


0.0 


Astrocytes rest 


5.7 


Secondary CD8
lymphocyte rest 


5.3 


Astrocytes TNFalpha + IL-
1beta


4.6 


Secondary CD8
lymphocyte act


0.0 


KU-812 (Basophil) rest 


6.3 


CD4 lymphocyte none 


0.0 


KU-812 (Basophil) 

PMA/ionomycin 


7.2


2ry Thl/Th2/Trl anti- 
CD95CH11 


0.0 


CCD1106 (Keratinocytes)
none 


0.0 


LAK cells rest 


0.0 


CCD1106 (Keratinocytes) 
TNFalpha + IL-1beta


0.0 


LAK cells IL-2 


0.0 


Liver cirrhosis 


29.9 


LAK cells IL-2+IL-12


0.0 


Lupus kidney 


4.1 


LAK cells IL-2+IFN 
gamma 


0.0


NCI-H292 none




LAK cells IL-2+IL-18 


0.0 


NCI-H292 IL-4


0.0 


LAK cells 
PMA/ionomycin 


0.0


NCI-H292 IL-9


0.0


NK Cells IL-2 rest 


0.0 


NCI-H292 IL-13 


0.0 


Two Way MLR 3 day 


0.0 


NCI-H292 IFN gamma 


10.0 


Two Way MLR 5 day 


0.0 


HPAEC none 


0.0 


Two Way MLR 7 day 


0.0 


HPAEC TNF alpha + IL-1 
beta 


13.7 


PBMC rest 


0.0 


Lung fibroblast none 


0.0 


PBMC PWM 


6.0 


Lung fibroblast TNF alpha + 
IL-1 beta 


0.0 


PBMC PHA-L


0.0 


Lung fibroblast IL-4 


0.0 


Ramos (B cell) none 


0.0 


Lung fibroblast IL-9 


7.2 


Ramos (B cell) 
ionomycin 


4.0 


Lung fibroblast IL-13 


0.0


B lymphocytes PWM 


0.0 


Lung fibroblast IFN gamma 


0.0 


B lymphocytes CD40L
and IL-4 


0.0 


Dermal fibroblast CCD1070
rest


0.0 


EOL-1 dbcAMP 


6.3 


Dermal fibroblast CCD1070
TNF alpha 


0.0 


EOL-1 dbcAMP 
PMA/ionomycin 


5.1 


Dermal fibroblast CCD1070
IL-1 beta 


6.0 


Dendritic cells none


0.0 


Dermal fibroblast IFN 

gamma 


0.0 


Dendritic cells LPS 


0.0 


Dermal fibroblast IL-4


0.0 


Dendritic cells anti- 

CD40 


0.0 


IBD Colitis 2 


0.0 


Monocytes rest 


0.0 


IBD Crohn's 


0.0 


Monocytes LPS 


5.6 


Colon 


100.0 


Macrophages rest 


0.0 


Lung 


81.8


Macrophages LPS


0.0 


Thymus 


0.0 


HUVEC none


0.0 


Kidney 


0.0 


HUVEC starved 


0.0 







Panel 1.3D Summary: Ag3064 Results from one experiment with the NOV11 gene are not
included. The amp plot indicates that there were experimental difficulties with this run.



Panel 4D Summary: Ag3064 The NOV11 gene is expressed at low levels in
normal colon and lung (CTs=34.5), and may be useful as a marker for colon and lung tissue.

I. NOV12 - CG57083-01: TESTICULAR SERINE PROTEASE like

Expression of the NOV12 gene was assessed using the primer-probe set Ag563,
described in Table IA. Results of the RTQ-PCR runs are shown in Tables IB, IC, ID, and IE.



Table IA. Probe Name Ag563

Primers   Sequences                                      Length   Start Position   SEQ ID NO:
Forward   5'-gaagatgtctgtgcaccggat-3'                    21       546              497
Probe     TET-5'-cacccatccagactttgagaagctccac-3'-TAMRA   28       570              498
Reverse   5'-catggcaatgtcactcccaa-3'                     20       602              499



Table IB. General_screening_panel_v1.4



Tissue Name 


Rel. Exp.(%) Ag563, 
Run 219923406 


Tissue Name 


Rel. Exp.(%) Ag563,
Run 219923406 


Adipose 


0.0 


Renal ca. TK-10 


7.7 


Melanoma* 
Hs688(A).T 


0.0 


Bladder 


0.0 


Melanoma* 
Hs688(B).T 


1.8 


Gastric ca. (liver met.) 
NCI-N87 


0.0 


Melanoma* M14


0.0 


Gastric ca. KATO III 


0.0 


Melanoma* 
LOXIMVI 


0.0 


Colon ca. SW-948 


3.4 


Melanoma* SK-MEL- 
5 


0.0 


Colon ca. SW480 


0.0 


Squamous cell 
carcinoma SCC-4 


0.0 


Colon ca.* (SW480 met) 
SW620 


0.0 


Testis Pool 


0.0 


Colon ca. HT29 


0.0 


Prostate ca.* (bone 
met) PC-3 


0.0 


Colon ca. HCT-116 


3.3 


Prostate Pool 


0.0 


Colon ca. CaCo-2 


0.0 


Placenta 


1.8 


Colon cancer tissue 


0.0 


Uterus Pool 


0.0 


Colon ca. SW1116


0.0 


Ovarian ca. OVCAR-3


0.0 


Colon ca. Colo-205 


0.0 


Ovarian ca. SK-OV-3 


3.1 


Colon ca. SW-48 


0.0


Ovarian ca. OVCAR-4


0.0 


Colon Pool 


1.6 


Ovarian ca. OVCAR-5 


0.0 


Small Intestine Pool 


63 


Ovarian ca. IGROV-1 


0.0 


Stomach Pool 


1.3 


Ovarian ca OVCAR-8 


0.0 


Bone Marrow Pool 


0.0 


Ovary 


0.0 


Fetal Heart 


0.0 


Breast ca. MCF-7 


0.0 


Heart Pool 


1.9 


Breast ca. MDA-MB- 
231 


0.0


Lymph Node Pool


0.0


Breast ca. BT 549


1.8 


Fetal Skeletal Muscle 


0.0 


Breast ca. T47D 


0.0 


Skeletal Muscle Pool 


0.0 


Breast ca. MDA-N


0.0 


Spleen Pool 


0.0 


Breast Pool 


3.1 


Thymus Pool 


1.7 


Trachea 


0.0 


CNS cancer (glio/astro)
U87-MG


0.0 


Lung 


2.2 


CNS cancer (glio/astro)
U-118-MG 


2.0 


Fetal Lung 


3.1 


CNS cancer (neuro;met)

SK-N-AS 


0.0 


Lung ca. NCI-N417


0.0 


CNS cancer (astro) SF-
539


0.0 


Lung ca. LX-1


0.0


CNS cancer (astro) SNB- 
75 


1.0


Lung ca. NCI-H146 


0.0 


CNS cancer (glio) SNB- 
19 


0.0 


Lung ca. SHP-77 


100.0 


CNS cancer (glio) SF-295 


1.7 


Lung ca. A549 


0.0 


Brain (Amygdala) Pool 


0.0 


Lung ca. NCI-H526


0.0 


Brain (cerebellum) 


12.3 


Lung ca. NCI-H23


3.7 


Brain (fetal) 


10.0 


Lung ca. NCI-H460


0.0


Brain (Hippocampus) 
Pool 


0.0


Lung ca. HOP-62


0.0 


Cerebral Cortex Pool 


0.0 


Lung ca. NCI-H522


0.0 


Brain (Substantia nigra) 
Pool 


0.0 


Liver 


0.0 


Brain (Thalamus) Pool 


0.0 


Fetal Liver 


4.0 


Brain (whole) 


0.7 


Liver ca. HepG2 


18.3 


Spinal Cord Pool 


0.0 


Kidney Pool 


1.8 


Adrenal Gland 


0.0 


Fetal Kidney 


0.0 


Pituitary gland Pool 


1.5 


Renal ca. 786-0 


0.0 


Salivary Gland 


0.0 


Renal ca. A498 


7.0 


Thyroid (female) 


0.0 


Renal ca. ACHN 


5.4 


Pancreatic ca. CAPAN2 


0.0 


Renal ca. UO-31


1.5 


Pancreas Pool 


0.0 



Table IC. Panel 1.1



Tissue Name


Rel. Exp.(%) Ag563,
Run 109491882


Tissue Name


Rel. Exp.(%) Ag563,
Run 109491882


Adrenal gland


0.0 


Renal ca. UO-31


5.7 


Bladder 


1.6 


Renal ca.RXF 393 


0.0 


Brain (amygdala) 


0.5 


Liver 


0.0 


Brain (cerebellum) 


15.8 


Liver (fetal) 


0.1 


Brain (hippocampus) 


6.2 


Liver ca. (hepatoblast) 
HepG2 


79.0 


Brain (substantia nigra) 


4.2 


Lung 


0.0 


Brain (thalamus) 


2.3 


Lung (fetal) 


0.0 


Cerebral Cortex 


0.9 


Lung ca. (non-s.cell)
HOP-62 


2.1 


Brain (fetal) 


6.1 


Lung ca. (large 
cell)NCI-H460 


1.7 


Brain (whole) 


10.5 


Lung ca. (non-s.cell) 
NCI-H23 


3.6 


glio/astro U-118-MG


2.5 


Lung ca. (non-s.cl) 
NCI-H522 


0.2 


astrocytoma SF-539 


0.1 


Lung ca. (non-sm. cell) 
A549 


7.3 


astrocytoma SNB-75 


2.2 


Lung ca. (s.cell var.) 
SHP-77 


100.0 


astrocytoma SW1783 


0.5 


Lung ca. (small cell) 
LX-1


2.1 


glioma U251 


13 


Lung ca. (small cell)

NCI-H69 


59.5 


glioma SF-295 


0.1 


Lung ca. (squam.) SW 
900 


2.8 


glioma SNB-19 


3.8 


Lung ca. (squam.) 
NCI-H596 


18.6 


glio/astro U87-MG 


2.0 


Lymph node 


0.0 


neuro*; met SK-N-AS 


1.5 


Spleen 


0.0 


Mammary gland 


0.0 


Thymus 


13.9 


Breast ca. BT-549 


4.5 


Ovary 


0.0 


Breast ca.MDA-N 


4.7 


Ovarian ca. IGROV-1 


9.8 


Breast ca.* (pl.ef) T47D 


6.8 


Ovarian ca. OVCAR-3 


0.3 


Breast ca.* (pl.ef) MCF-
7


1.6 


Ovarian ca. OVCAR-4 


0.0 


Breast ca.* (pl.ef) MDA- 
MB-231 


0.0 


Ovarian ca. OVCAR-5 


27.5 


Small intestine 


0.1 


Ovarian ca. OVCAR-8 


3.1 






Ovarian ca* (ascites) 
SK-OV-3 




Colon ca. HT29 


4.9 


Pancreas 


0.2 


Colon ca. CaCo-2 


0.0 


Pancreatic ca. CAPAN 
2 


0.0 


Colon ca. HCT-15 


13.8 


Pituitary gland 


0.4 


Colon ca. HCT-116


0.1 


Placenta 


0.0 


Colon ca. HCC-2998 


4.7 


Prostate 


0.0


Colon ca. SW480


0.0 


Prostate ca.* (bone 
met) PC-3 


0.0 


Colon ca.* SW620
(SW480 met)


0.0






Stomach 


0.2


Trachea 


0.0 


Gastric ca. (liver met) 
NCI-N87 


5.2 


Spinal cord 


0.2 


Heart 


0.1 


Testis 


35.6 


Skeletal muscle (Fetal) 


0.0


Thyroid 


0.2 


Skeletal muscle 


1.1 


Uterus 


0.2 


Endothelial cells 


0.0 


Melanoma M14 


11.4 


Heart (Fetal) 


0.0 


Melanoma LOX IMVI 


0.4 




0.0 


Melanoma UACC-62 


0.0 


Kidney (fetal) 


1.9 


Melanoma SK-MEL- 
28 


0.1 


Renal ca. 786-0 


1.2 


Melanoma* (met) SK- 
MEL-5 


0.0 


Renal ca. A498 


4.1 


Melanoma Hs688(A).T 


0.0 


Renal ca. ACHN 


2.9 


Melanoma* (met) 
Hs688(B).T 


1.8 


Renal ca. TK-10 


6.3 







Table ID. Panel 1.2



Tissue Name 


Rel. Exp.(%) Ag563, 
Run 117053190 


Tissue Name 


Rel. Exp.(%)Ag563, 
Run 117053190 


Endothelial cells 


0.0 


Renal ca. 786-0 


0.0 


Heart (Fetal) 


0.0 


Renal ca. A498 


0.9 


Pancreas 


0.0 


Renal ca.RXF 393 


0.0 


Pancreatic ca. CAPAN 2 


0.0 


Renal ca ACHN 


0.0 


Adrenal Gland 


0.0 


Renal caUO-31 


0.0 


Thyroid 


0.0 


Renal ca TK-10 


0.0 


Salivary gland 


0.0 


Liver 


0.0 


Pituitary gland 


4.0 


Liver (fetal) 


0.0 


Brain (fetal) 


95.9 


Liver ca. (hepatoblast) 
HepG2 


0.9 


Brain (whole) 


64.6 


Lung 


0.0 


Brain (amygdala) 


3.9 


Lung (fetal) 


0.0 


Brain (cerebellum) 


213 


Lung ca. (small cell) 
LX-1


0.0 


Brain (hippocampus) 


33.4 


Lung ca. (small cell) 
NCI-H69 


1.1 


Brain (thalamus) 


49.0 


Lung ca. (s.cell var.) 
SHP-77 


4.0 


Cerebral Cortex 


47.6 


Lung ca. (large 
cell)NCI-H460 


0.0 


Spinal cord 


4.7 


Lung ca. (non-sm. cell) 

A549 


0.0


glio/astro U87-MG


0.0 


Lung ca. (non-s.cell)

NCI-H23 


2.6 


glio/astro U-118-MG


0.0 


Lung ca. (non-s.cell)
HOP-62


0.0 


astrocytoma SW1783 


0.0 


Lung ca. (non-s.cl) 

NCI-H522 


40.6 


neuro*; met SK-N-AS 


0.7 


Lung ca. (squam.) SW 
900 


0.0 


astrocytoma SF-539 


0.0 


Lung ca. (squam.) NCI- 
H596 


0.0 


astrocytoma SNB-75 


0.0 


Mammary gland 


0.0 


glioma SNB-19


0.0 


Breast ca.* (pl.ef) 
MCF-7 


0.0 


glioma U251 


0.0 


Breast ca.* (pl.ef) 
MDA-MB-231


0.0 


glioma SF-295


0.0 


Breast ca.* (pl.ef)
T47D 


0.0 


Heart 


0.0 


Breast ca. BT-549 


0.0 


Skeletal Muscle 


0.0 


Breast ca. MDA-N 


0.0 


Bone marrow 


0.0 


Ovary 


0.2 


Thymus 


0.0 


Ovarian ca. OVCAR-3 


0.0 


Spleen 


0.0 


Ovarian ca. OVCAR-4 


0.0 


Lymph node 


0.0 


Ovarian ca OVCAR-5 


2.7 


Colorectal Tissue 


0.0 


Ovarian ca OVCAR-8 


0.0 


Stomach 


0.0 


Ovarian ca IGROV-1 


0.0 


Small intestine 


0.0 


Ovarian ca.* (ascites)
SK-OV-3


0.0 


Colon ca. SW480 


0.0 


Uterus 


0.0 


Colon ca * SW620 
(SW480met) 


0.0 


Placenta 


0.0 


Colon ca. HT29 


0.0 


Prostate 


0.0 


Colon ca HCT-116 


0.0 


Prostate ca.* (bone 
met) PC-3 


0.0 


Colon ca. CaCo-2 


0.0 


Testis 


100.0 


Colon ca Tissue 
(OD03866) 


0.0 


Melanoma Hs688(A).T 


0.0 


Colon ca. HCC-2998 


0.8 


Melanoma* (met) 
Hs688(B).T 


0.0 


Gastric ca.* (liver met)
NCI-N87 


0.0 


Melanoma UACC-62 


0.0 


Bladder 


0.0 


Melanoma M14 


0.0 


Trachea 


0.0 


Melanoma LOX IMVI


0.0 


Kidney 


0.0 


Melanoma* (met) SK- 
MEL-5 


0.0 


Kidney (fetal) 


0.0 







Table IE. Panel 4D

Rel. Exp.(%) Ag563, Run 138134496

Tissue Name                      Rel. Exp.(%)   Tissue Name                                   Rel. Exp.(%)
Secondary Th1 act                0.0            HUVEC IL-1beta                                0.0
Secondary Th2 act                0.0            HUVEC IFN gamma                               0.0
Secondary Tr1 act                [illegible]    HUVEC TNF alpha + IFN gamma                   0.0
Secondary Th1 rest               0.0            HUVEC TNF alpha + IL4                         21.6
Secondary Th2 rest               11.0           HUVEC IL-11                                   14.0
Secondary Tr1 rest               0.0            Lung Microvascular EC none                    0.0
Primary Th1 act                  10.6           Lung Microvascular EC TNFalpha + IL-1beta     0.0
Primary Th2 act                  0.0            Microvascular Dermal EC none                  0.0
Primary Tr1 act                  0.0            Microvascular Dermal EC TNFalpha + IL-1beta   0.0
Primary Th1 rest                 0.0            Bronchial epithelium TNFalpha + IL1beta       10.4
Primary Th2 rest                 0.0            Small airway epithelium none                  0.0
Primary Tr1 rest                 [illegible]    Small airway epithelium TNFalpha + IL-1beta   0.0
CD45RA CD4 lymphocyte act        0.0            Coronary artery SMC rest                      0.0
CD45RO CD4 lymphocyte act        [illegible]    Coronary artery SMC TNFalpha + IL-1beta       11.1
CD8 lymphocyte act               0.0            Astrocytes rest                               0.0
Secondary CD8 lymphocyte rest    [illegible]    Astrocytes TNFalpha + IL-1beta                0.0
Secondary CD8 lymphocyte act     11.1           KU-812 (Basophil) rest                        0.0
CD4 lymphocyte none              [illegible]    KU-812 (Basophil) PMA/ionomycin               0.0
2ry Th1/Th2/Tr1 anti-CD95 CH11   [illegible]    CCD1106 (Keratinocytes) none                  0.0
LAK cells rest                   [illegible]    CCD1106 (Keratinocytes) TNFalpha + IL-1beta   0.0
LAK cells IL-2                   0.0            Liver cirrhosis                               100.0
LAK cells IL-2+IL-12             0.0            Lupus kidney                                  13.7
LAK cells IL-2+IFN gamma         0.0            NCI-H292 none                                 [illegible]
LAK cells IL-2+IL-18             0.0            NCI-H292 IL-4                                 0.0
LAK cells PMA/ionomycin          0.0            NCI-H292 IL-9                                 0.0
NK Cells IL-2 rest               0.0            NCI-H292 IL-13                                0.0
Two Way MLR 3 day                36.9           NCI-H292 IFN gamma                            0.0
Two Way MLR 5 day                0.0            HPAEC none                                    0.0
Two Way MLR 7 day                [illegible]    HPAEC TNF alpha + IL-1 beta                   0.0
PBMC rest                        0.0            Lung fibroblast none                          0.0
PBMC PWM                         [illegible]    Lung fibroblast TNF alpha + IL-1 beta         0.0
PBMC PHA-L                       12.0           Lung fibroblast IL-4                          0.0
Ramos (B cell) none              0.0            Lung fibroblast IL-9                          0.0
Ramos (B cell) ionomycin         0.0            Lung fibroblast IL-13                         9.7
B lymphocytes PWM                10.2           Lung fibroblast IFN gamma                     0.0
B lymphocytes CD40L and IL-4     0.0            Dermal fibroblast CCD1070 rest                11.1
EOL-1 dbcAMP                     0.0            Dermal fibroblast CCD1070 TNF alpha           0.0
EOL-1 dbcAMP PMA/ionomycin       11.9           Dermal fibroblast CCD1070 IL-1 beta           0.0
Dendritic cells none             0.0            Dermal fibroblast IFN gamma                   [illegible]
Dendritic cells LPS              0.0            Dermal fibroblast IL-4                        16.4
Dendritic cells anti-CD40        0.0            IBD Colitis 2                                 0.0
Monocytes rest                   0.0            IBD Crohn's                                   0.0
Monocytes LPS                    0.0            Colon                                         61.1
Macrophages rest                 0.0            Lung                                          43.5
Macrophages LPS                  0.0            Thymus                                        0.0
HUVEC none                       0.0            Kidney                                        0.0
HUVEC starved                    0.0







General_screening_panel_v1.4 Summary: Ag563 Expression of the NOV12 gene is
restricted to a sample derived from a lung cancer cell line (CT=32.6). Thus, expression of this
gene could be used to differentiate between this sample and other samples on this panel and
as a marker to detect the presence of lung cancer. Furthermore, therapeutic modulation of the
expression or function of this gene may be effective in the treatment of lung cancer.

Panel 1.1 Summary: Ag563 Highest expression of the NOV12 gene is seen in a lung cancer
cell line (CT=26.7). Significant expression is also seen in clusters of cell line samples derived
from melanoma, liver cancer, ovarian cancer, renal cancer and colon cancer. Thus, expression
of the NOV12 gene could be used to differentiate these samples from other samples on this
panel and as a marker to detect the presence of these cancers. Furthermore, therapeutic
modulation of the expression or function of this gene may be effective in the treatment of
these cancers.

While expression of this gene is predominant among cancer cell line samples,
significant expression is also seen in the testis and the brain. Expression in the testis indicates
that this gene product may be involved in male fertility. Furthermore, expression in the brain
indicates that this gene product may be involved in the normal homeostasis of this organ.
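
The tables report relative expression as a percentage, with the top sample on each panel at 100.0, while the summaries quote CT values. One common way to relate the two is the delta-CT convention, in which each additional cycle corresponds to a two-fold lower template amount. That convention is an assumption here and is not stated in this section; the function name `relative_expression` is likewise hypothetical.

```python
def relative_expression(ct, ct_min):
    """Percent expression relative to the sample with the lowest CT on a panel.

    Assumption: ideal amplification, i.e. template doubles every cycle,
    so a sample that is n cycles later than the best sample has 1/2**n
    of its template.
    """
    return 100.0 * 2.0 ** (ct_min - ct)
```

Under this assumption, a sample whose CT is two cycles higher than the best sample on the panel would be reported at one quarter of its expression.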

Panel 1.2 Summary: Ag563 Expression of the NOV12 gene in this panel is in agreement
with the expression seen in the previous panels. Significant expression is seen in testis, a lung
cancer cell line and the brain. Please see Panel 1.1 for discussion of utility of this gene in
these tissues.

Panel 4D Summary: Ag563 Significant expression of the NOV12 gene is detected in a liver
cirrhosis sample (CT = 32.7). Furthermore, expression of this gene is not detected in normal
liver in Panels 1.1 and 1.2, suggesting that its expression is unique to liver cirrhosis.
Therefore, antibodies or small molecule therapeutics designed with the protein encoded by
this gene could reduce or inhibit fibrosis that occurs in liver cirrhosis. In addition, expression
of this gene could also be used for the diagnosis of liver cirrhosis.

Panel 5 Islet Summary: Ag563 Expression of the NOV12 gene is low/undetectable in all 
samples on this panel (CTs>35). (Data not shown.) 

Panel CNS_1 Summary: Ag563 Expression of the NOV12 gene is low/undetectable in all
samples on this panel (CTs>35). (Data not shown.)

J. NOV17 - CG57177-01: Carboxypeptidase B

Expression of the NOV17 gene was assessed using the primer-probe set Ag4136,
described in Table JA. Results of the RTQ-PCR runs are shown in Tables JB, JC and JD.

Table JA. Probe Name Ag4136

Primers   Sequences                                      Length   Start Position   SEQ ID NO:
Forward   5'-attgacttctggaagccagatt-3'                   22       148              500
Probe     TET-5'-tgtcacacaaatcaaacctcacagtaca-3'-TAMRA   28       171              501
Reverse   5'-cttctgctttaacacggaagtc-3'                   22       202              502



Table JB. CNS_neurodegeneration_v1.0



Tissue Name 


Rel. Exp.(%) Ag4136, Run 
214961497 


Tissue Name 


Rel. Exp.(%) 
Ag4136,Run 
214961497 


AD 1 Hippo 


24.7 


Control (Path) 3 
Temporal Ctx 


0.0


AD 2 Hippo


Ol .1 


Control (Path) 4 
Temporal Ctx 


16.0 


AD 3 Hippo 


0.0 


AD 1 Occipital Ctx 


0.0 


AD 4 Hippo 


9.1 


AD 2 Occipital Ctx 
(Missing) 


0.0 


AD 5 hippo 


6.0 


AD 3 Occipital Ctx 


0.0 


AD 6 Hippo 


34.9 


AD 4 Occipital Ctx 




Control 2 Hippo 


17.0 


AD 5 Occipital Ctx 


25.7 


Control 4 Hippo 


43.8 


AD 6 Occipital Ctx 


16.6 


Control (Path) 3 Hippo 


0.0 


Control 1 Occipital Ctx 


0.0 


AD 1 Temporal Ctx 


0.0 


Control 2 Occipital Ctx 


43.5 


AD 2 Temporal Ctx


76.8 


Control 3 Occipital Ctx 


12.5 


AD 3 Temporal Ctx 


0.0 


Control 4 Occipital Ctx 


32.8 


AD 4 Temporal Ctx 


22.2 


i^v/iiirui ysT <xwi) 1 

Occipital Ctx 


100.0 


AD 5 Inf Temporal Ctx 


11.9 


Control (Path) 2 
Occipital Ctx 


27.2 


AD 5 SupTemporal Ctx 


41.2 


Control (Path) 3 
Occipital Ctx 


0.0 


AD 6 Inf Temporal Ctx 


48.0 


Control (Path) 4 
Occipital Ctx 


21.3 


AD 6 Sup Temporal Ctx 


0.0 


Control 1 Parietal Ctx 


0.0 


Control 1 Temnoral Ctx 


0.0 


Control 2 Parietal Ctx 


5.5 


Control 2 Temporal Ctx 


44.8 


^Control 3 Parietal Ctx 


35.6 


Control 3 Temporal Ctx 


14.8 


Control fPath^ 1 Parietal 
Ctx 


37.1 


Control 4 Temporal Ctx 


32.8 


Control (Path) 2 Parietal 
Ctx 


11.0 


Control (Path) 1 
Temporal Ctx 


36.3 


Control (Path) 3 Parietal 

Ctx 


0.0 


Control (Path) 2 
Temporal Ctx 


22.4 


Control (Path) 4 Parietal 
Ctx 


0.0 



Table JC. General_screening_panel_v1.4 

Tissue Name | Rel. Exp.(%) Ag4136, Run 220967145 | Tissue Name | Rel. Exp.(%) Ag4136, Run 220967145 
Adipose | 0.0 | Renal ca. TK-10 | 0.0 
Melanoma* Hs688(A).T | 0.0 | Bladder | 100.0 
Melanoma* Hs688(B).T | 0.0 | Gastric ca. (liver met.) NCI-N87 | 0.0 
Melanoma* M14 | 0.0 | Gastric ca. KATO III | 0.0 
Melanoma* LOXIMVI | 0.0 | Colon ca. SW-948 | 0.0 
Melanoma* SK-MEL-5 | 0.0 | Colon ca. SW480 | 0.0 
Squamous cell carcinoma SCC-4 | 0.0 | Colon ca.* (SW480 met) SW620 | 0.0 
Testis Pool | 0.0 | Colon ca. HT29 | 0.0 
Prostate ca.* (bone met) PC-3 | 0.0 | Colon ca. HCT-116 | 0.0 
Prostate Pool | 0.0 | Colon ca. CaCo-2 | [illegible] 
Placenta | [illegible] | Colon cancer tissue | [illegible] 
Uterus Pool | [illegible] | Colon ca. SW1116 | [illegible] 
Ovarian ca. OVCAR-3 | 0.0 | Colon ca. Colo-205 | [illegible] 
Ovarian ca. SK-OV-3 | 0.0 | Colon ca. SW-48 | [illegible] 
Ovarian ca. OVCAR-4 | 0.0 | Colon Pool | 0.0 
Ovarian ca. OVCAR-5 | 0.0 | Small Intestine Pool | 0.0 
Ovarian ca. IGROV-1 | 0.0 | Stomach Pool | 0.0 
Ovarian ca. OVCAR-8 | 0.0 | Bone Marrow Pool | 0.0 
Ovary | 0.0 | Fetal Heart | 0.0 
Breast ca. MCF-7 | 0.0 | Heart Pool | 0.0 
Breast ca. MDA-MB-231 | [illegible] | Lymph Node Pool | 0.0 
Breast ca. BT 549 | 0.0 | Fetal Skeletal Muscle | 0.0 
Breast ca. T47D | 0.0 | Skeletal Muscle Pool | 0.0 
Breast ca. MDA-N | 0.0 | Spleen Pool | 0.2 
Breast Pool | 0.0 | Thymus Pool | 0.0 
Trachea | 0.0 | CNS cancer (glio/astro) U87-MG | 0.0 
Lung | 0.0 | CNS cancer (glio/astro) U-118-MG | 0.0 
Fetal Lung | 0.0 | CNS cancer (neuro;met) SK-N-AS | 0.0 
Lung ca. NCI-N417 | 0.0 | CNS cancer (astro) SF-539 | 0.0 
Lung ca. LX-1 | 0.0 | CNS cancer (astro) SNB-75 | 0.0 
Lung ca. NCI-H146 | 0.0 | CNS cancer (glio) SNB-19 | 0.0 
Lung ca. SHP-77 | 0.0 | CNS cancer (glio) SF-295 | 0.0 
Lung ca. A549 | 0.0 | Brain (Amygdala) Pool | 0.0 
Lung ca. NCI-H526 | 0.0 | Brain (cerebellum) | 0.0 
Lung ca. NCI-H23 | 0.0 | Brain (fetal) | 0.0 
Lung ca. NCI-H460 | 0.0 | Brain (Hippocampus) Pool | 0.0 
Lung ca. HOP-62 | 0.0 | Cerebral Cortex Pool | 0.0 
Lung ca. NCI-H522 | 0.0 | Brain (Substantia nigra) Pool | 0.0 
Liver | 0.0 | Brain (Thalamus) Pool | 0.0 
Fetal Liver | 0.3 | Brain (whole) | 0.0 
Liver ca. HepG2 | 0.0 | Spinal Cord Pool | 0.0 
Kidney Pool | 0.0 | Adrenal Gland | 0.5 
Fetal Kidney | 0.0 | Pituitary gland Pool | 0.0 
Renal ca. 786-0 | 0.0 | Salivary Gland | 0.0 
Renal ca. A498 | 0.0 | Thyroid (female) | 0.0 
Renal ca. ACHN | 0.0 | Pancreatic ca. CAPAN2 | 0.0 
Renal ca. UO-31 | 0.0 | Pancreas Pool | 45.7 



Table JD. Panel 4.1D 

Tissue Name | Rel. Exp.(%) Ag4136, Run 173118872 | Tissue Name | Rel. Exp.(%) Ag4136, Run 173118872 
Secondary Th1 act | 0.0 | HUVEC IL-1beta | 0.0 
Secondary Th2 act | 0.0 | HUVEC IFN gamma | 0.0 
Secondary Tr1 act | 0.0 | HUVEC TNF alpha + IFN gamma | 0.0 
Secondary Th1 rest | 0.0 | HUVEC TNF alpha + IL4 | 0.0 
Secondary Th2 rest | 0.0 | HUVEC IL-11 | 1.8 
Secondary Tr1 rest | [illegible] | Lung Microvascular EC none | 0.0 
Primary Th1 act | 0.0 | Lung Microvascular EC TNFalpha + IL-1beta | 0.0 
Primary Th2 act | 0.0 | [Microvascular Dermal EC] none | 0.0 
Primary Tr1 act | 0.0 | [Microvascular Dermal EC] TNFalpha + IL-1beta | 0.0 
Primary Th1 rest | 0.0 | Bronchial epithelium TNFalpha + IL1beta | 0.0 
Primary Th2 rest | 0.0 | Small airway epithelium none | 0.0 
Primary Tr1 rest | 0.0 | Small airway epithelium TNFalpha + IL-1beta | 0.0 
CD45RA CD4 lymphocyte act | [illegible] | Coronary artery SMC rest | [illegible] 
CD45RO CD4 lymphocyte act | 0.0 | Coronary artery SMC TNFalpha + IL-1beta | 0.0 
CD8 lymphocyte act | 0.0 | Astrocytes rest | 0.0 
Secondary CD8 lymphocyte rest | 0.0 | Astrocytes TNFalpha + IL-1beta | 0.5 
Secondary CD8 lymphocyte act | 0.0 | KU-812 (Basophil) rest | 27.9 
CD4 lymphocyte none | 0.0 | KU-812 (Basophil) PMA/ionomycin | 62.0 
2ry Th1/Th2/Tr1 anti-CD95 CH11 | 0.0 | CCD1106 (Keratinocytes) none | 0.0 
LAK cells rest | 0.3 | CCD1106 (Keratinocytes) TNFalpha + IL-1beta | 0.0 
LAK cells IL-2 | 0.5 | Liver cirrhosis | 0.0 
LAK cells IL-2+IL-12 | 0.0 | NCI-H292 none | 0.0 
LAK cells IL-2+IFN gamma | 0.0 | NCI-H292 IL-4 | 0.1 
LAK cells IL-2+IL-18 | 0.0 | NCI-H292 IL-9 | 0.5 
LAK cells PMA/ionomycin | 0.0 | NCI-H292 IL-13 | 0.0 
NK Cells IL-2 rest | 0.0 | NCI-H292 IFN gamma | 0.0 
Two Way MLR 3 day | 0.4 | HPAEC none | 0.0 
Two Way MLR 5 day | 0.0 | HPAEC TNF alpha + IL-1 beta | 0.5 
Two Way MLR 7 day | 0.0 | Lung fibroblast none | 2.0 
PBMC rest | 0.0 | [Lung fibroblast TNF alpha +] IL-1 beta | 0.5 
PBMC PWM | [illegible] | Lung fibroblast IL-4 | 0.0 
PBMC PHA-L | 0.0 | Lung fibroblast IL-9 | 0.0 
Ramos (B cell) none | 0.0 | Lung fibroblast IL-13 | 1.1 
Ramos (B cell) ionomycin | 0.0 | Lung fibroblast IFN gamma | 0.0 
B lymphocytes PWM | 0.0 | Dermal fibroblast CCD1070 rest | 0.0 
[B lymphocytes CD40L] and IL-4 | 0.0 | Dermal fibroblast CCD1070 TNF alpha | 0.0 
EOL-1 dbcAMP | 0.0 | Dermal fibroblast CCD1070 IL-1 beta | 0.0 
EOL-1 dbcAMP PMA/ionomycin | 0.0 | Dermal fibroblast IFN gamma | [blank] 
Dendritic cells none | 0.0 | Dermal fibroblast IL-4 | 0.0 
Dendritic cells LPS | 2.0 | Dermal Fibroblasts rest | 1.3 
Dendritic cells anti-CD40 | 0.0 | Neutrophils TNFa+LPS | 0.4 
Monocytes rest | 0.0 | Neutrophils rest | 0.7 
Monocytes LPS | 0.0 | Colon | 0.9 
Macrophages rest | 0.0 | Lung | 4.5 
Macrophages LPS | 0.0 | Thymus | 19.1 
HUVEC none | 0.0 | Kidney | 100.0 
HUVEC starved | 0.0 | | 







CNS_neurodegeneration_v1.0 Summary: Ag4136 Expression levels of the NOV17 gene 
in the brain are very low, and no disease association is evident from this panel. 
Carboxypeptidase B is, however, a known mediator of beta-amyloid clearance in the brain, 
and consequently plays an important role in Alzheimer's disease. Even low expression of 
the NOV17 gene may therefore be sufficient to impart significant beta-amyloid clearance, 
especially over time. Accordingly, agents that augment the function of this gene product 
may have utility as therapeutics in the treatment of Alzheimer's disease. 

References: 

Matsumoto A, Itoh K, Seki T, Motozaki K, Matsuyama S. Human brain 
carboxypeptidase B, which cleaves beta-amyloid peptides in vitro, is expressed in the 
endoplasmic reticulum of neurons. Eur J Neurosci 2001 May;13(9):1653-7 

Intracellular localization of novel human brain carboxypeptidase B (HBCPB) was 
investigated in human hippocampus, using immunohistochemistry by confocal laser 
microscopy and biochemical purification of the homogenate by density gradient 
ultracentrifugation. The former revealed that the majority of HBCPB was expressed in the 
endoplasmic reticulum, in which the HBCPB-specific C14-module immunoreactivity was 
colocalized with GRP78 immunoreactivity, a stress 70 heat shock protein specifically 
expressed in the endoplasmic reticulum. The latter showed that anti-C14-module 
immunoreactivity and prepro-HBCPB immunoreactivity were both enriched in the 
microsome fraction, especially in that of the endoplasmic reticulum-density fraction of 
normal human hippocampal homogenates from various sources. However, HBCPB prepared 
from human hippocampus showed exopeptidase activity for synthetic beta-amyloid 1-42 
peptide, in which Abeta X-42 C-terminus immunoreactivity was decreased in a fashion dose- 
dependent of the amount of the protease added. These findings indicate that HBCPB, which 
is expressed in the endoplasmic reticulum of a group of neuronal perikarya, may play an 
important physiological role in degradation of beta-amyloid 1-42, which is specifically 
generated in the endoplasmic reticulum of human and rodent neurons and is also regarded as 
the most pathogenic and aggregatable species among all beta-amyloid peptides. 

General_screening_panel_v1.4 Summary: Ag4136 Significant expression of the NOV17 
gene, a carboxypeptidase B homolog, is restricted to pancreas and bladder (CTs=20-22). 
Thus, expression of this gene could be used to differentiate these samples from the other 
samples on this panel and as a marker of these tissues. 

Panel 4.1D Summary: Ag4136 Expression of the NOV17 gene, a carboxypeptidase B 
homolog, is limited to a few samples, with highest expression in the kidney (CT=29.6). 
Therefore, antibody or small molecule therapies designed with the protein encoded by this 
gene could modulate kidney function and be important in the treatment of inflammatory or 
autoimmune diseases that affect the kidney, including lupus and glomerulonephritis. The 
NOV17 gene is also expressed at moderate levels in KU-812 basophil cells treated with 
PMA/ionomycin and at lower levels in untreated basophils. These cells are a reasonable 
model for the inflammatory cells that take part in various inflammatory lung and bowel 
diseases, such as asthma, Crohn's disease, and ulcerative colitis. Therefore, therapeutic 
modulation of the expression or function of this gene may also be effective in the treatment of 
these diseases. 
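The summaries in this section quote both CT values and relative expression percentages. The Python sketch below shows how the two can be related, assuming the common RTQ-PCR convention that one CT cycle corresponds to a two-fold difference in template; that convention, the function names, and the choice of kidney as the 100% reference are assumptions made here for illustration, and the normalization actually applied to these panels is the one described elsewhere in the specification.

```python
import math


def relative_expression(ct, ct_reference):
    """Relative expression (%) versus the reference sample, assuming one CT
    cycle corresponds to a two-fold difference in template."""
    return 100.0 * 2.0 ** (ct_reference - ct)


def implied_ct(rel_exp_percent, ct_reference):
    """CT implied by a relative-expression percentage under the same assumption."""
    if rel_exp_percent <= 0:
        raise ValueError("relative expression must be positive")
    return ct_reference + math.log2(100.0 / rel_exp_percent)


if __name__ == "__main__":
    kidney_ct = 29.6  # CT reported for kidney, the 100% sample on Panel 4.1D
    for sample, pct in [("KU-812 PMA/ionomycin", 62.0), ("KU-812 rest", 27.9)]:
        print(f"{sample}: implied CT ~ {implied_ct(pct, kidney_ct):.1f}")
```

Under this assumption, a sample at 50% relative expression sits one cycle above the reference CT, which is why samples reported at or near 0.0% correspond to CT values several cycles higher.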

K. NOV5 - CG57081-01: serine/threonine kinase 

Expression of the NOV5 gene was assessed using the primer-probe set Ag3072, 
described in Table KA. 

Table KA. Probe Name Ag3072 

Primers | Sequences | Length | Start Position | SEQ ID NO: 
Forward | 5'-tggtggtagacctgcttctg-3' | 20 | 536 | 503 
Probe | TET-5'-gacctacgttaccacctgcagcagaa-3'-TAMRA | 26 | 562 | 504 
Reverse | 5'-ctcactgtgtcctcggagaa-3' | 20 | 595 | 505 
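The start positions and lengths in Table KA can be checked for a sensible primer-probe layout, with the probe lying between the two primers and no oligo overlapping another. The Python sketch below assumes that each Start Position is the lowest target coordinate covered by that oligo, including the reverse primer; that reading, and all names used, are illustrative assumptions rather than statements from the specification.

```python
from typing import NamedTuple


class Oligo(NamedTuple):
    name: str
    start: int   # Start Position from Table KA
    length: int  # Length from Table KA


# Ag3072 values as listed in Table KA.
AG3072 = [
    Oligo("Forward", 536, 20),
    Oligo("Probe", 562, 26),
    Oligo("Reverse", 595, 20),
]


def end_position(oligo):
    """Last covered position, assuming `start` is the lowest covered coordinate."""
    return oligo.start + oligo.length - 1


def layout_is_consistent(oligos):
    """True if the oligos, sorted by start, do not overlap one another."""
    ordered = sorted(oligos, key=lambda o: o.start)
    return all(end_position(a) < b.start for a, b in zip(ordered, ordered[1:]))


def amplicon_length(oligos):
    """Span from the first covered position to the last covered position."""
    return max(end_position(o) for o in oligos) - min(o.start for o in oligos) + 1


if __name__ == "__main__":
    print("non-overlapping:", layout_is_consistent(AG3072))
    print("amplicon span:", amplicon_length(AG3072))
```

Under the stated assumption, the Ag3072 set spans 79 bases from the first base of the forward primer to the last base covered by the reverse primer.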



CNS_neurodegeneration_v1.0 Summary: Ag3072 Expression of the NOV5 gene is 
low/undetectable in all samples on this panel (CTs>35). (Data not shown.) 

Panel 1.3D Summary: Ag3072 Expression of the NOV5 gene is low/undetectable in all 
samples on this panel (CTs>35). (Data not shown.) 

Panel 2.2 Summary: Ag3072 Expression of the NOV5 gene is low/undetectable in all 
samples on this panel (CTs>35). (Data not shown.) 

Panel 4D Summary: Ag3072 Expression of the NOV5 gene is low/undetectable in all 
samples on this panel (CTs>35). (Data not shown.) 

L. NOV37 - CG57335-01: Protocadherin beta 3 

Expression of the NOV37 gene was assessed using the primer-probe set Ag3192, 
described in Table LA. Results of the RTQ-PCR runs are shown in Tables LB and LC. 

Table LA. Probe Name Ag3192 

Primers | Sequences | Length | Start Position | SEQ ID NO: 
Forward | 5'-ctggtacggattgaagttgtg-3' | 21 | 1107 | 506 
Probe | TET-5'-catcaatgacaacgtcccagagtt-3'-TAMRA | 24 | 1130 | 507 
Reverse | 5'-gttccaatgtctaaatccctg-3' | 21 | 1226 | 508 



Table LB. Panel 1.3D 

Tissue Name | Rel. Exp.(%) Ag3192, Run 165673603 | Rel. Exp.(%) Ag3192, Run 167994640 | Tissue Name | Rel. Exp.(%) Ag3192, Run 165673603 | Rel. Exp.(%) Ag3192, Run 167994640 
Liver adenocarcinoma | 59.0 | 19.6 | Kidney (fetal) | 8.7 | 6.2 
Pancreas | 2.9 | 0.0 | Renal ca. 786-0 | 11.7 | 1.5 
Pancreatic ca. CAPAN2 | 0.0 | 0.0 | Renal ca. A498 | 24.8 | 4.7 
Adrenal gland | 2.2 | 0.0 | Renal ca. RXF 393 | 0.0 | 0.0 
Thyroid | 3.2 | 0.3 | Renal ca. ACHN | 0.0 | 0.0 
Salivary gland | 2.0 | 0.5 | Renal ca. UO-31 | 14.0 | 1.9 
Pituitary gland | 12.6 | 2.0 | Renal ca. TK-10 | 10.1 | 9.2 
[illegible] | [illegible] | 8.1 | Liver | 3.9 | 0.0 
Brain (whole) | 17.3 | 3.7 | Liver (fetal) | 0.0 | 0.0 
Brain (amygdala) | 23.8 | 2.6 | Liver ca. (hepatoblast) HepG2 | 24.3 | 4.4 
Brain (cerebellum) | 40.6 | 3.2 | Lung | 10.3 | 0.8 
Brain (hippocampus) | 15.5 | 2.9 | Lung (fetal) | 9.2 | 3.1 
Brain (substantia nigra) | 3.1 | 1.7 | Lung ca. (small cell) LX-1 | 0.0 | 0.0 
Brain (thalamus) | 12.7 | 1.8 | Lung ca. (small cell) NCI-H69 | 8.8 | 2.0 
Cerebral Cortex | 26.4 | 5.1 | Lung ca. (s.cell var.) SHP-77 | 100.0 | 100.0 
Spinal cord | 12.9 | 1.7 | Lung ca. (large cell) NCI-H460 | 52.9 | 0.0 
glio/astro U87-MG | 4.1 | 1.1 | Lung ca. (non-sm. cell) A549 | 4.2 | 6.6 
glio/astro U-118-MG | 40.9 | 4.0 | Lung ca. (non-s.cell) NCI-H23 | 57.4 | 7.7 
astrocytoma SW1783 | 31.6 | 6.0 | Lung ca. (non-s.cell) HOP-62 | [illegible] | [illegible] 
neuro*; met SK-N-AS | 0.0 | 0.0 | Lung ca. (non-s.cl) NCI-H522 | [blank] | [blank] 
astrocytoma SF-539 | 54.3 | 14.8 | Lung ca. (squam.) SW 900 | 3.3 | 0.3 
astrocytoma SNB-75 | 21.2 | 2.4 | Lung ca. (squam.) NCI-H596 | 3.4 | 2.4 
glioma SNB-19 | 98.6 | 17.8 | Mammary gland | 12.8 | 2.2 
glioma U251 | 55.9 | 6.0 | Breast ca.* (pl.ef) MCF-7 | 27.0 | 7.6 
glioma SF-295 | 6.3 | 4.4 | Breast ca.* MDA-MB-231 | 0.0 | 0.0 
Heart (fetal) | 0.0 | 1.0 | Breast ca.* (pl.ef) T47D | 24.8 | 32.8 
Heart | 6.0 | 1.1 | Breast ca. BT-549 | 75.3 | 3.4 
Skeletal muscle (fetal) | 10.6 | 1.2 | Breast ca. MDA-N | 27.0 | 8.1 
Skeletal muscle | 2.8 | 0.3 | Ovary | 10.5 | 2.2 
Bone marrow | 0.0 | 0.0 | Ovarian ca. OVCAR-3 | 15.2 | 1.4 
Thymus | 1.2 | 0.4 | Ovarian ca. OVCAR-4 | 18.8 | 3.8 
Spleen | 1.5 | 0.3 | Ovarian ca. OVCAR-5 | 0.0 | 0.0 
Lymph node | 9.3 | 0.0 | Ovarian ca. OVCAR-8 | 0.0 | 0.7 
Colorectal | 1.8 | 0.4 | Ovarian ca. IGROV-1 | 8.5 | 3.5 
Stomach | 5.4 | 0.0 | Ovarian ca. (ascites) SK-OV-3 | 56.6 | 71.7 
Small intestine | 13.2 | 0.9 | Uterus | 31.9 | 1.0 
Colon ca. SW480 | 0.0 | 0.0 | Placenta | 0.0 | 0.0 
Colon ca.* SW620 (SW480 met) | 0.0 | 1.5 | Prostate | 5.4 | 0.4 
Colon ca. HT29 | 0.0 | 0.0 | Prostate ca.* (bone met) PC-3 | 22.4 | 12.6 
Colon ca. HCT-116 | [illegible] | [illegible] | Testis | [illegible] | [illegible] 
Colon ca. CaCo-2 | 0.0 | 0.0 | Melanoma Hs688(A).T | 2.4 | 1.3 
Colon ca. tissue (ODO3866) | [illegible] | [illegible] | Melanoma* Hs688(B).T | [illegible] | [blank] 
Colon ca. HCC-[illegible] | 1.5 | 0.5 | Melanoma UACC-62 | 30.8 | 11.8 
Gastric ca. (liver met) NCI-N87 | 81.8 | 8.2 | Melanoma M14 | 0.0 | 0.0 
Bladder | 4.8 | 1.0 | Melanoma LOXIMVI | 10.0 | 14.3 
Trachea | 4.5 | 0.6 | Melanoma* (met) SK-MEL-5 | 0.0 | 0.9 
Kidney | 2.2 | 0.8 | Adipose | 4.6 | 1.6 



Table LC. Panel 4D 

Tissue Name | Rel. Exp.(%) Ag3192, Run 164389283 | Tissue Name | Rel. Exp.(%) Ag3192, Run 164389283 
Secondary Th1 act | 0.0 | HUVEC IL-1beta | 0.0 
Secondary Th2 act | 0.0 | HUVEC IFN gamma | 0.0 
Secondary Tr1 act | 0.0 | HUVEC TNF alpha + IFN gamma | 0.0 
Secondary Th1 rest | 0.0 | HUVEC TNF alpha + IL4 | 0.0 
Secondary Th2 rest | 0.0 | HUVEC IL-11 | 0.8 
Secondary Tr1 rest | 0.0 | Lung Microvascular EC none | 0.8 
Primary Th1 act | 0.0 | Lung Microvascular EC TNFalpha + IL-1beta | 1.4 
Primary Th2 act | 0.0 | Microvascular Dermal EC none | 1.6 
Primary Tr1 act | 0.0 | Microvascular Dermal EC TNFalpha + IL-1beta | 0.0 
Primary Th1 rest | 0.0 | Bronchial epithelium TNFalpha + IL1beta | 1.6 
Primary Th2 rest | 0.0 | Small airway epithelium none | 0.0 
Primary Tr1 rest | 0.0 | Small airway epithelium TNFalpha + IL-1beta | 12.8 
CD45RA CD4 lymphocyte act | 0.7 | Coronary artery SMC rest | 2.5 
CD45RO CD4 lymphocyte act | 0.0 | Coronary artery SMC TNFalpha + IL-1beta | 0.0 
CD8 lymphocyte act | 0.0 | Astrocytes rest | 47.0 
Secondary CD8 lymphocyte rest | 0.0 | Astrocytes TNFalpha + IL-1beta | 39.0 
Secondary CD8 lymphocyte act | 0.0 | KU-812 (Basophil) rest | 12.9 
CD4 lymphocyte none | 0.0 | KU-812 (Basophil) PMA/ionomycin | 25.3 
2ry Th1/Th2/Tr1 anti-CD95 CH11 | 0.0 | CCD1106 (Keratinocytes) none | 0.5 
LAK cells rest | 0.0 | CCD1106 (Keratinocytes) TNFalpha + IL-1beta | 1.3 
LAK cells IL-2 | 0.5 | Liver cirrhosis | 1.0 
LAK cells IL-2+IL-12 | 0.0 | Lupus kidney | 1.5 
LAK cells IL-2+IFN gamma | [illegible] | NCI-H292 none | [illegible] 
LAK cells IL-2+IL-18 | 0.0 | NCI-H292 IL-4 | 70.2 
LAK cells PMA/ionomycin | 0.0 | [NCI-H292 IL-9] | 100.0 
NK Cells IL-2 rest | 0.6 | NCI-H292 IL-13 | 47.0 
Two Way MLR 3 day | 0.0 | NCI-H292 IFN gamma | 40.1 
Two Way MLR 5 day | 0.0 | HPAEC none | 2.1 
Two Way MLR 7 day | 0.0 | HPAEC TNF alpha + IL-1 beta | 2.7 
PBMC rest | 0.5 | Lung fibroblast none | 10.2 
PBMC PWM | 0.0 | Lung fibroblast TNF alpha + IL-1 beta | 2.8 
PBMC PHA-L | 0.0 | Lung fibroblast IL-4 | 13.8 
Ramos (B cell) none | 0.0 | Lung fibroblast IL-9 | 7.0 
Ramos (B cell) ionomycin | 0.0 | Lung fibroblast IL-13 | 8.4 
B lymphocytes PWM | 0.0 | Lung fibroblast IFN gamma | 15.2 
B lymphocytes CD40L and IL-4 | 0.0 | Dermal fibroblast CCD1070 rest | 3.5 
EOL-1 dbcAMP | 0.0 | Dermal fibroblast CCD1070 TNF alpha | 4.3 
[EOL-1 dbcAMP] PMA/ionomycin | 0.0 | Dermal fibroblast CCD1070 IL-1 beta | 3.0 
Dendritic cells none | 0.0 | Dermal fibroblast IFN gamma | 1.5 
Dendritic cells LPS | 0.0 | Dermal fibroblast IL-4 | 2.3 
Dendritic cells anti-CD40 | 0.0 | IBD Colitis 2 | 0.0 
Monocytes rest | 0.0 | IBD Crohn's | 0.0 
Monocytes LPS | 0.0 | Colon | 4.5 
Macrophages rest | 0.0 | Lung | 19.6 
Macrophages LPS | 0.0 | Thymus | 5.3 
HUVEC none | 0.7 | Kidney | 2.2 
HUVEC starved | 1.4 | | 







Panel 1.3D Summary: Ag3192 Two experiments with the same probe and primer set 
produce results that are in reasonable agreement, with highest expression in a lung cancer cell 
line (CTs=31-33). 

Significant levels of expression are also seen in cell lines derived from liver, ovarian, 
breast, gastric, and brain cancers. Thus, expression of the NOV37 gene could be used to 
differentiate between these samples and other samples on this panel and as a marker to detect 
the presence of these cancers. Furthermore, therapeutic modulation of the expression or 
function of the NOV37 gene may be effective in the treatment of liver, ovarian, breast, 
gastric, and brain cancers. 
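The agreement between the two Ag3192 runs noted in the Panel 1.3D summary can be examined numerically. The Python sketch below computes a Pearson correlation over a handful of rows copied from Table LB and reports the top sample in each run; the selection of rows, the use of Pearson correlation, and the function names are illustrative choices made here and are not an analysis performed in the specification.

```python
import math

# (sample, Run 165673603, Run 167994640): a few rows copied from Table LB.
PAIRS = [
    ("Kidney (fetal)", 8.7, 6.2),
    ("Renal ca. A498", 24.8, 4.7),
    ("Brain (cerebellum)", 40.6, 3.2),
    ("Lung ca. (s.cell var.) SHP-77", 100.0, 100.0),
    ("glioma SNB-19", 98.6, 17.8),
    ("Breast ca.* (pl.ef) T47D", 24.8, 32.8),
    ("Ovarian ca. (ascites) SK-OV-3", 56.6, 71.7),
]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def top_sample(pairs, run_index):
    """Name of the sample with the highest value in the given run (1 or 2)."""
    return max(pairs, key=lambda row: row[run_index])[0]


if __name__ == "__main__":
    run1 = [row[1] for row in PAIRS]
    run2 = [row[2] for row in PAIRS]
    print(f"Pearson r over {len(PAIRS)} samples: {pearson(run1, run2):.2f}")
    print("Top sample, run 1:", top_sample(PAIRS, 1))
    print("Top sample, run 2:", top_sample(PAIRS, 2))
```

Both runs place the SHP-77 lung cancer cell line at 100%, which is the point of agreement the summary relies on; the remaining rows differ more between runs, so a rank-based comparison may be a reasonable alternative to Pearson correlation.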

In addition, the NOV37 gene, a protocadherin homolog, is detected at low levels in 
the CNS; levels are highest in the cerebellum. The cadherins have been shown to be critical 
for CNS development, specifically for the guidance of axons, dendrites and/or growth cones 
in general. Therapeutic modulation of the levels of this protein, or possible signaling via this 
protein, may be of utility in enhancing/directing compensatory synaptogenesis and fiber 
growth in the CNS in response to neuronal death (stroke, head trauma), axon lesion (spinal 
cord injury), or neurodegeneration (Alzheimer's, Parkinson's, Huntington's, vascular dementia 
or any neurodegenerative disease). Since protocadherins play an important role in 
synaptogenesis, this gene product may also be involved in depression and schizophrenia, which 
also involve synaptogenesis. Because this cadherin shows its highest expression in the 
cerebellum, it is also an excellent candidate target for the spinocerebellar ataxias. 

References: 

Hilschmann N, Barnikol HU, Barnikol-Watanabe S, Gotz H, Kratzin H, Thinnes FP. 
The immunoglobulin-like genetic predetermination of the brain: the protocadherins, blueprint 
of the neuronal network. Naturwissenschaften 2001 Jan;88(1):2-12 

The morphogenesis of the brain is governed by synaptogenesis. Synaptogenesis in 
turn is determined by cell adhesion molecules, which bridge the synaptic cleft and, by 
homophilic contact, decide which neurons are connected and which are not. Because of their 
enormous diversification in specificities, protocadherins (pcdh alpha, pcdh beta, pcdh 
gamma), a new class of cadherins, play a decisive role. Surprisingly, the genetic control of 
the protocadherins is very similar to that of the immunoglobulins. There are three sets of 
variable (V) genes followed by a corresponding constant (C) gene. Applying the rules of the 
immunoglobulin genes to the protocadherin genes leads, despite of this similarity, to quite 
different results in the central nervous system. The lymphocyte expresses one single receptor 
molecule specifically directed against an outside stimulus. In contrast, there are three specific 
recognition sites in each neuron, each expressing a different protocadherin. In this way, 4,950 
different neurons arising from one stem cell form a neuronal network, in which homophilic 
contacts can be formed in 52 layers, permitting an enormous number of different connections 
and restraints between neurons. This network is one module of the central computer of the 
brain. Since the V-genes are generated during evolution and V-gene translocation during 
embryogenesis, outside stimuli have no influence on this network. The network is an inborn 
property of the protocadherin genes. Every circuit produced, as well as learning and memory, 
has to be based on this genetically predetermined network. This network is so universal that it 
can cope with everything, even the unexpected. In this respect the neuronal network 
resembles the recognition sites of the immunoglobulins. 

Panel 2.2 Summary: Ag3192 Expression of the NOV37 gene is low/undetectable in all 
samples on this panel (CTs>35). (Data not shown.) 

Panel 4D Summary: Ag3192 The NOV37 transcript is expressed in NCI-H292 cells. 
Treatment of these cells does not seem to significantly alter expression of this transcript in 
this muco-epidermoid cell line. Thus, the protein could be used to identify certain lung 
tumors similar to NCI-H292, consistent with Panel 1.3D. The encoded protein may also 
contribute to the normal function of the goblet cells within the lung. Therefore, designing 
therapeutics to this protein may be important for the treatment of emphysema and asthma as 
well as other lung diseases in which goblet cells or the mucus they produce have pathological 
consequences. 

Panel CNS_1 Summary: Ag3192 Expression of the NOV37 gene is low/undetectable in all 
samples on this panel (CTs>35). (Data not shown.) 

M. M28130: Sequence from Methods of Use for Interleukin-8 (IL-8) and Anti-IL-8 
Antibodies Patent 

Expression of gene IL-8 (GenBank Accession No. M28130) was assessed using the 
primer-probe set Ag1016, described in Table MA. Results of the RTQ-PCR runs are shown 
in Tables MB, MC and MD. 



Table MA. Probe Name Ag1016 

Primers | Sequences | Length | Start Position | SEQ ID NO: 
Forward | 5'-attgcacgggagaatatacaaa-3' | 22 | 673 | 509 
Probe | TET-5'-ccaagggccaagagaatatccgaact-3'-TAMRA | 26 | 708 | 510 
Reverse | 5'-tcacattctagcaaacccattc-3' | 22 | 749 | 511 



Table MB. AI_comprehensive panel_v1.0 

Tissue Name | Rel. Exp.(%) Ag1016, Run 211059879 | Rel. Exp.(%) Ag1016, Run 212309939 | Tissue Name | Rel. Exp.(%) Ag1016, Run 211059879 | Rel. Exp.(%) Ag1016, Run 212309939 
110967 COPD-F | 0.4 | 0.2 | 112427 Match Control Psoriasis-F | 0.1 | 0.1 
110980 COPD-F | 0.1 | 0.1 | 112418 Psoriasis-M | 0.5 | 0.6 
110968 COPD-M | 0.1 | 0.3 | 112723 Match Control Psoriasis-M | 0.0 | 0.0 
110977 COPD-M | 0.4 | 0.4 | 112419 Psoriasis-M | 1.6 | 1.9 
110989 Emphysema-F | 0.0 | 0.1 | 112424 Match Control Psoriasis-M | 0.1 | 0.1 
110992 Emphysema-F | 0.0 | 0.1 | 112420 Psoriasis-M | 0.5 | 0.6 
110993 Emphysema-F | 0.0 | 0.0 | 112425 Match Control Psoriasis-M | 0.1 | 0.1 
110994 Emphysema-F | 0.0 | 0.0 | 104689 (MF) OA Bone-Backus | 0.2 | 0.1 
110995 Emphysema-F | 0.0 | 0.1 | 104690 (MF) Adj "Normal" Bone-Backus | 0.4 | 0.2 
110996 Emphysema-F | 0.0 | 0.1 | 104691 (MF) OA Synovium-Backus | 0.1 | 0.0 
110997 Asthma-M | 1.0 | 1.1 | 104692 (BA) OA Cartilage-Backus | 0.1 | 0.1 
111001 Asthma-F | 0.1 | 0.0 | 104694 (BA) OA Bone-Backus | 0.3 | 0.2 
111002 Asthma-F | 0.1 | 0.0 | 104695 (BA) Adj "Normal" Bone-Backus | 0.0 | 0.1 
111003 Atopic Asthma-F | 0.1 | 0.0 | 104696 (BA) OA Synovium-Backus | 0.8 | 1.0 
111004 Atopic Asthma-F | 0.2 | 0.2 | 104700 (SS) OA Bone-Backus | 1.3 | 2.6 
111005 Atopic Asthma-F | 0.1 | 0.1 | 104701 (SS) Adj "Normal" Bone-Backus | 0.3 | 0.3 
111006 Atopic Asthma-F | [illegible] | [illegible] | 104702 (SS) OA Synovium-Backus | 0.7 | 0.6 
111417 Allergy-M | 0.0 | 0.2 | 117093 OA Cartilage Rep7 | 0.5 | 0.6 
112347 Allergy-M | 0.0 | 0.0 | 112672 OA [Bone5] | 0.2 | 0.2 
112349 Normal Lung-F | 0.0 | 0.0 | 112673 OA Synovium5 | 0.1 | 0.1 
112357 Normal Lung-F | 0.8 | 0.4 | 112674 OA Synovial Fluid cells5 | 0.2 | 0.1 
112354 Normal Lung-M | 0.4 | 0.7 | 117100 OA Cartilage Rep14 | 0.0 | 0.0 
112374 Crohns-F | 0.1 | 0.1 | 112756 OA Bone9 | 2.5 | 2.2 
112389 Match Control Crohns-F | 0.1 | 0.0 | 112757 OA Synovium9 | 0.0 | 0.1 
112375 Crohns-F | 0.0 | 0.0 | 112758 OA Synovial Fluid Cells9 | 0.1 | 0.1 
112732 Match Control Crohns-F | 1.1 | 0.9 | 117125 RA Cartilage Rep2 | 0.1 | 0.0 
112725 Crohns-M | 0.0 | 0.1 | 113492 Bone2 RA | 6.0 | 7.2 
112387 Match Control Crohns-M | 0.0 | 0.0 | 113493 Synovium2 RA | 1.7 | 1.5 
112378 Crohns-M | 0.0 | 0.0 | 113494 Syn Fluid Cells RA | 3.4 | 3.6 
112390 Match Control Crohns-M | 0.0 | 0.0 | 113499 Cartilage4 RA | 5.6 | 4.9 
112726 Crohns-M | 0.2 | 0.2 | 113500 Bone4 RA | 6.1 | 4.4 
112731 Match Control Crohns-M | 0.4 | 0.4 | 113501 Synovium4 RA | 3.3 | 4.3 
112380 Ulcer Col-F | 0.0 | 0.2 | 113502 Syn Fluid Cells4 RA | 3.1 | 2.5 
112734 Match Control Ulcer Col-F | 100.0 | 100.0 | 113495 Cartilage3 RA | 3.0 | 3.0 
112384 Ulcer Col-F | 0.3 | 0.2 | 113496 Bone3 RA | 3.2 | 3.7 
112737 Match Control Ulcer Col-F | 0.1 | 0.5 | 113497 Synovium3 RA | 1.7 | 2.1 
112386 Ulcer Col-F | 0.0 | 0.1 | 113498 Syn Fluid Cells3 RA | 4.8 | 4.1 
112738 Match Control Ulcer Col-F | 2.3 | 2.6 | 117106 Normal Cartilage Rep20 | 0.1 | 0.0 
112381 Ulcer Col-M | 0.0 | 0.1 | 113663 Bone3 Normal | 0.0 | 0.0 
112735 Match Control Ulcer Col-M | 0.1 | 0.5 | 113664 Synovium3 Normal | 0.0 | [illegible] 
112382 Ulcer Col-M | 0.4 | 0.3 | 113665 Syn Fluid Cells3 Normal | 0.0 | 0.0 
112394 Match Control Ulcer Col-M | 0.0 | 0.0 | 117107 Normal Cartilage Rep22 | 0.0 | 0.0 
112383 Ulcer Col-M | 0.8 | 0.7 | 113667 Bone4 Normal | 0.1 | 0.1 
112736 Match Control Ulcer Col-M | 0.1 | 0.0 | 113668 Synovium4 Normal | 0.1 | 0.1 
112423 Psoriasis-F | 7.1 | 7.6 | 113669 Syn Fluid Cells4 Normal | 0.2 | 0.2 



Table MC , General_screeningj)anel_yL4 



Tissue Name 


Rei:Exp.(%) 
Agl016,Run 
208030041 


Rel. Exp.(%) 
Agl016, Run 
212141063 


Tissue Name 


Rel. Exp.(%) ; 

Agl016,Run 

208030041 


Rel. Exp.(%) 
Agl016,Run 
212141063 


Adipose 


2.5 


3.4 


Renal ca. TK-10 


8.9 


11.7 


Melanoma* 
Hs688(A).T 


0.1 


0.0 


Bladder 


1.6 


4.7 


Melanoma* 
Hs688(B).T 


0.1 


0.1 


Gastric ca. (liver 
met.)NCI-N87 


0.2 


0.5 


Melanoma* 
M14 


0.0 


0.0 


Gastric ca. KATO 
III 


3.1 


10.7 


Melanoma* 
LOXIMVI 


46.0 


54.0 


Colon ca. SW-948 


2.3 


8.8 


Melanoma* 
SK-MEL-5 


0.3 


0.7 


Colon ca. SW480 

, ,„ 


0.0 


0.0 


Squamous cell 

carcinoma 

SCC-4 


6.1 


12.6 


Colon ca.* 

(SW480met) 

SW620 


A O 
U.O 


A ^2 


Testis Pool 


0.0 


0-1 


Colon ca. HT29 


A 1 


A 1 
U.l 


Prostate ca.* 
(bone met) PC- 


2.1 


2.5 


Colon ca. HCT- 
116 


u.u 


n 1 
u.l 


Prostate Pool 


0.2 


0.2 jColon ca. CaCo-2 


0.0 


0.0 


Placenta 


0.0 


^ ^ JColon cancer 

Itissue 


100.0 


100.0 


Uterus Pool 


0.0 


0.0 


Colon ca. SW1I16 


0.0 


0.0 


Ovarian ca. 
OVCAR-3 


0.1 


0.4 


Colon ca. Colo- 
205 


0.0 


0.0 


Ovarian ca. 
SK-OV-3 


1.3 


5.7 


Colon ca. SW-48 


0.0 


0.0 


Ovarian ca. 
OVCAR-4 


1.0 


2.8 


Colon Pool 


0.0 


0.0 


Ovarian ca. 
OVCAR-5 


0.1 


^ - ^Small Intestine 
iPool 


0.0 


0.1 


Ovarian ca. 
IGROV-1 


0.0 


i 

0.1 jStomachPool 


0.2 


1.3 


Ovarian ca. 
OVCAR-8 


0.1 


^ 2 |Bone Marrow 
fPool 


0.2 


0.3 


Ovary 


0.0 


0.0 


Fetal Heart 


0.0 


0.0 • 


Breast ca. 


0.0 


0.0 


Heart Pool 


0.0 


0.0 
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MCF-7 












Breast ca. 
MDA-MB-231 


0.6 


1.0 


Lymph Node Pool 


0.0 


0.0 


Breast ca. BT 
549 


3.2 


0.0 


Fetal Skeletal 
Muscle 


0.0 


0.1 


Breast ca. 
T47D 


0.2 


0.8 


Pool 


0.1 


0.1 


Breast ca. 


0.1 


0.2 


Spleen Pool 


0.1 


0.5 


Breast Pool 


0.1 


0.1 


Thymus Pool 


0.3 


1.1 


Trachea 


4.5 


7.8 


CNS cancer 
(glio/astro) U87- 
MG 


96.6 


94.0 


Lung 


0.1 


02 


CNS cancer 
(glio/astro) U-1 18- 

iVlVJ 


26.6 


33.4 


Fetal Lung 


4.6 


D./ 


CNS cancer 
(neurojmet) SK- 
N-AS 


u.u 


n 1 

U. 1 


Lung ca. NCI- 
N417 


0.0 


0.0 


CNS cancer 
(astro) SF-539 


0.0 


0.0 


Lung ca. LX-1 


2.2 


1.3 


CJNb cancer 
(astro) SNB-75 


0.5 


1.3 


Lung ca. NCI- 
H146 


0.0 


0.0 


CNS cancer (glio) 
SNB-19 


0.0 


0.2 


Lung ca. SHP- 
77 


1.7 


2.1 


CNS cancer (glio) 
SF-295 


6.1 


10.8 


Lung ca. A549 


0.4 


0.3 


Brain (Amygdala) 

jtOOI 


0.0 


0.1 


Lung ca. NCI- 


0.0 


0.0 


Brain (cerebellum) 


0.0 


0.0 


Lung ca. NCI- 


1.2 


2.6 


Brain (fetal) 


0.1 


0.4 


Lung ca. NCI- 
H460 






Brain 

^oippocampus ) 
Pool 


1 A 




Lung ca. HOP- 
62 


2.1 


7.3 


Cerebral Cortex 
Pool 


0-1 


0.2 


Lung ca. NCI- 
H522 


0.0 


0.0 


Brain (Substantia 
nigra) Pool 


0.1 


0.2 


Liver 


0.0 


0.0 


Brain (Thalamus) 
Pool 


0.0 


0.2 


Fetal Liver 


0.0 


0.0 


Brain (whole) 


0.0 


0.1 


Liver ca. 
HepG2 


0.0 


0.0 


Spinal Cord Pool 


0.3 


1.1 


Kidney Pool 


0.1 


0.3 


Adrenal Gland 


0.1 


0.4 


Fetal Kidney 


0.0 


0.1 


Pituitary gl^d 
Pool 


0.0 


0.0 


Renal ca. 786-0 


0.3 


1.1 


Salivary Gland 


0.2 


0.1 
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Renal ca. A498 


0.0 


0.0 


Thyroid (female) 


0.1 


0.3 


Renal ca. 
ACHN 


1.9 


6.3 


Pancreatic ca. 
CAPAN2 


0.3 


0.7 


Renal ca. UO- 
31 


24.5


63.7 


Pancreas Pool 


0.4 


0.7 



Table MD. Panel 4.1D



Tissue Name 


Rel. Exp.(%)
Ag1016, Run
170806468


Rel. Exp.(%)
Ag1016, Run
246331464


Tissue Name


Rel. Exp.(%)
Ag1016, Run
170806468


Rel. Exp.(%)
Ag1016, Run
246331464


Secondary Thl act 


1.2 


1.5 


HUVEC IL-lbeta 


6.8 


8.2 


Secondary Th2 act 


0.2 


0.3 


HUVEC IFN
gamma


0.2 


0.4 


Secondary Trl act 


0.4 


0.5 


HUVEC TNF alpha
+ IFN gamma 


4.8 


A /C 

4.0 


Secondary Thl rest 


0.0 


0.0 

._ . , 


HUVEC TNF alpha 
+ IL4 


2.5 


2.6 


Secondary Th2 rest 


A A 


A A 
U.U 




V/.v 


0 0 


Secondary Trl rest 


0.0 


0.0 


Lung Microvascular
EC none


0.1 


0.1 


Primary Thl act 


0.0 


0.0 


Lung Microvascular 
EC TNFaIpha+ IL- 
lbeta 


13.3


11.2


Primary Th2 act 


0.0 


0.0 


Microvascular 
Dermal EC none 


0.2 


0.2 


Primary Trl act 


0.1 


0.1 


Microvascular
Dermal EC
TNFalpha + IL-
1beta


6.8 


7.7 


Primary Thl rest 


0.0 


0.0 


Bronchial
epithelium
TNFalpha +
IL-1beta


1 

2.1 


1.0 


Primary Th2 rest 


A A 


A A 
U.U 


Small airway 
epithelium none 


0 9 




Primary Trl rest 


0.0 


0.0 


Small airway 
epithelium 
TNFalpha + IL- 
lbeta 


3.5 


43 


CD45RA CD4 
lymphocyte act 


4.5 


5.1 


Coronary artery
SMC rest


11.5 


12.7 


CD45RO CD4 
lymphocyte act 


0.1 


0.1 


Coronary artery
SMC TNFalpha +
IL-1beta


19.1 


14.0 


CD8 lymphocyte act


0.0 


0.0 


Astrocytes rest 


0.1 


0.1 


Secondary CD8
lymphocyte rest


0.0 


0.0 


Astrocytes
TNFalpha + IL-
1beta


2.9 


3.8 


Secondary CD8
lymphocyte act


0.0


0.0


KU-812 (Basophil)
rest


0.0


0.0


CD4 lymphocyte 

none 


0.0 


0.0 


KU-812 (Basophil) 
PMA/ionomycin 


0.5 


0.7 


2ry
Th1/Th2/Tr1_anti-
CD95 CH11


0.0 


0.0 


CCD1106 

(Keratinocytes) j 
none 


0.0 


0.2 


LAK cells rest 


0.2 


0.2 


CCD1106 

(Keratinocytes) 
TNFalpha + IL- 
Ibeta 


0.0 


0.0 


LAK cells IL-2 


0.0 


0.0 


Liver cirrhosis


0.1


0.1


LAK cells IL-2+IL-12


0.2


0.3


NCI-H292 none


0.1


0.0


LAK cells IL-2+IFN gamma


0.1


0.1


NCI-H292 IL-4


0.0


0.0


LAK cells IL-2+IL-18


0.1


0.1


NCI-H292 IL-9


0.2


0.1


LAK cells
PMA/ionomycin


33.2 


41.8 


NCI-H292 IL-13 


0.2 


0.6 


NK Cells IL-2 rest 


0.0 


0.0 


NCI-H292 IFN 
gamma 


0.1 


0.0 


Two Way MLR 3 
day 


1.6 


1.8 


HPAEC none 


2.0 

i 


0.2 


Two Way MLR 5 

day 


1.3 


1.3 


HPAEC TNF alpha
+ IL-1 beta 


18.2 


28J 


Two Way MLR 7 
day 


0.2 


0.2 


Lung fibroblast f ^ ^ 
none | 


0.0 




rt 




Lung fibroblast f 
TNF alpha + IL-1 |25.0 
beta : 


23 8 


PBMC PWM 


52 


8.7 


Lung fibroblast IL- 

4 1 


0.0 


PBMC PHA-L 


5.8 


5.1 


Lung fibroblast IL- L . 

9 1 


0.1 


Ramos (B cell) none 


0.0 


0.0 


Lung fibroblast IL- L ^ 
13 P** 


0.1 


Ramos (B cell) 


0.0 


0.0 


Lung fibroblast IFN 
gamma 


|o.. 


0.4 


B lymphocytes 
PWM 

X ¥V IVl 


0.1 


0.1 


Dermal fibroblast 
CCD1070rest 


OJ 


1.5 


B lymphocytes 
CD40L and IL-4 


0.2 


0.1 


Dermal fibroblast 
CCD1070 TNF 
alpha 


17.3 


19.8 


EOL-1 dbcAMP 


0.6 


0.7 


Dermal fibroblast 
CCD1070 IL-1 beta 


32.1 


37.4 


EOL-1 dbcAMP 
PMA/ionomycin 


7.7 


8.0 


Dermal fibroblast 
IFN gamma 


2.6 


0.4 


Dendritic cells none


0.0 


0.0 


Dermal fibroblast 
IL-4 


0.4 


1.0


Dendritic cells LPS




9 1 


Dermal Fibroblasts
rest 


7 9 




Dendritic cells anti-
CD40


0.0 


0.0 


Neutrophils 
TNFa+LPS 


53.6 


63.3


Monocytes rest


0.3


0.3


Neutrophils rest 


2.1 


6.2 


Monocytes LPS 


100.0 


100.0 


Colon 


0.0 


0.2 


Macrophages rest 


0.3


0.2 


Lung 


0.3


0.5 


Macrophages LPS 


33.4 


22.2 


Thymus 


0.2 


0.1 


HUVEC none


0.0 


0.0 


Kidney 


0.1 


0.1 


HUVEC starved 


0.0 


0.0 




AI comprehensive panel v1.0 Summary: Expression of IL-8 is widespread in this panel,
confirming the presence of IL-8 in samples related to the autoimmune response.

General screening panel v1.4 Summary: Prominent expression of this gene, an IL-8
homolog, on this panel is seen in cancer cell lines, including samples derived from brain,
lung, colon, renal and melanoma cancers. Because of the published role of IL-8 in mediating
angiogenesis, combination therapies with experimental or established anti-angiogenic drugs,
monoclonal antibodies and/or protein therapeutics are anticipated to display synergistic
efficacy in a clinical setting and to be effective in the treatment of these cancers.

Panel 4.1D Summary: The samples in this panel differ slightly from those in Panel 4D, with
samples derived from neutrophils present only on Panel 4.1D. Furthermore, IL-8 is
upregulated significantly in TNF-alpha/LPS treated neutrophils when compared to expression
in resting neutrophils. The expression of IL-8 is also upregulated significantly in the
following immune-stimulated cell types relative to their resting counterparts: dermal
fibroblasts treated with IL-1 beta or TNF alpha; microvascular endothelial cells treated with
IL-1 beta and TNF alpha; lung fibroblasts treated with IL-1 beta and TNF alpha; pulmonary
artery endothelial cells treated with IL-1 beta and TNF alpha; astrocytes treated with IL-1
beta and TNF alpha; small airway epithelium treated with IL-1 beta and TNF alpha;
HUVECs treated with TNF alpha or IL-1 beta; LPS treated macrophages, monocytes and
dendritic cells; activated eosinophils and peripheral blood mononuclear cells; and finally
LAK cells stimulated with PMA/ionomycin. The secretion of IL-8 by endothelial cells upon
stimulation with the inflammatory cytokines TNF alpha and IL-1 beta indicates that IL-8 may be
involved in the arrest of neutrophils and perhaps monocytes or endothelial cells, as well as
the subsequent transendothelial migration of these cells. Therefore, small molecule
antagonists or blocking mAbs to IL-8 may be potential therapeutics in acute inflammatory
diseases where neutrophils play an important role, such as ischemia reperfusion in the heart,
intestine and brain, as well as in endotoxic shock and ARDS. Neutrophils are also thought to
play an important role in one chronic inflammatory disease, emphysema (COPD), which
could also be treated with IL-8 antagonists. In chronic inflammatory diseases with an immune
component these antagonists may prevent the trafficking of monocytes to the area of
inflammation. This would eventually lead to a loss of Antigen Presenting Cells at the site of
inflammation, as monocytes can differentiate into dendritic cells. As a result, the immune
response would be down-regulated and the inflammation would subside. It is also known that
monocytes differentiate into macrophages at the site of inflammation. These cells are a major
source of inflammatory cytokines such as TNF alpha and IL-1 beta, which contribute to the
inflammation. Therefore, blockade of monocyte migration to the site over time will deplete
macrophages and result in a decrease in the production of pro-inflammatory cytokines at the
site and, as a result, the inflammation will decrease. Rheumatoid arthritis, inflammatory bowel
disease, asthma, atopic dermatitis, psoriasis, and multiple sclerosis all could be treated with
IL-8 antagonists to antagonize monocyte trafficking. In summary, the data show that IL-8 is a
target for antibody-mediated therapy for multiple inflammatory diseases including psoriasis,
asthma, allergy, emphysema, stroke, ischemia reperfusion injury, encephalitis, AIDS-related
dementia and septic shock.

Based on data provided in panel 1.3 and panel 2D, therapy directed against soluble IL-8
is anticipated to have a pronounced impact on the malignant progression of the following
human tumors: adenocarcinomas of the colon, squamous cell carcinomas and adenocarcinomas of the
lung, clear cell renal cell carcinomas, hepatocellular carcinomas, transitional cell carcinomas
of the bladder, cystadenocarcinomas and adenocarcinomas of the stomach, ovarian
tumors and thyroid tumors. Panel 1.3 also suggests applicability to the treatment of gliomas
and astrocytomas. Therapy could be applied clinically using a monoclonal antibody
immunospecifically recognizing (i.e. binding to, interacting with) IL-8. Such an antibody could be
conjugated to a prodrug-activating enzyme, a radioisotope, or any number of toxins that have
been applied in pre-clinical animal tumor xenograft models. Therapy might also be applied
by a tumor homing adenovirus or other viral vector system expressing a "ribozyme" designed
to specifically target the IL-8 messenger RNA molecule (transcript) for hydrolytic
degradation. Likewise, modified or unmodified antisense oligonucleotides designed to
disrupt IL-8 mRNA stability and/or translation, and targeted to these tumors by
various technologies (liposomes, tumor vascular homing peptides, direct intratumoral
injection and/or electroporation), would be anticipated to retard or block disease progression.
Because of the published role of IL-8 in mediating angiogenesis, combination therapies with
experimental or established anti-angiogenic drugs, monoclonal antibodies and/or protein
therapeutics are anticipated to display synergistic efficacy in a clinical setting.

Following physical trauma to the brain and spinal cord, leukocytes are quickly 
recruited to the damaged area and surrounding tissue. Such cells are thought to be involved in 
the instigation and perpetuation of local inflammatory responses (macrophage recruitment, 
infiltration and activation; free radical production) which further exacerbate tissue injury. 
There is evidence that the same mechanisms also operate in stroke, AIDS dementia, 
inflammatory peripheral neuropathies and other forms of CNS encephalitis.

IL-8 is also a therapeutic target in meningitis where it is involved in leukocyte 
recruitment. 

With respect to demyelinating diseases, antibodies to IL-8 may also have therapeutic use in
multiple sclerosis, cerebral lupus and other demyelinating disorders of the CNS. Entry of
leukocytes is critical for extracellular proteolysis, for the development of antibody-producing
cells that synthesize antibodies against myelin proteins, and for the recruitment of
macrophages to plaque sites in the cerebral white matter (Cuzner and Opdenakker, J.
Neuroimmunol., 94:1-14, 1999).

NOV22 - CG57256-01 and CG57256-02: Protein tyrosine phosphatase 

Expression of the NOV22 genes was assessed using the primer-probe set Ag3272, 
described in Table NA. Results of the RTQ-PCR runs are shown in Tables NB, NC and ND.

Table NA. Probe Name Ag3272

Primers | Sequences | Length | Start Position | SEQ ID NO:
Forward | 5'-tgccctagcatcagttgaag-3' | 20 | 365 | 512
Probe | TET-5'-tggaatgaaacatgaagatgcagtaca-3'-TAMRA | 27 | 386 | 513
Reverse | 5'-tttaaaagctccactccgct-3' | 20 | 427 | 514
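The entries of Table NA can be cross-checked mechanically. The following is only an illustrative sketch and not part of the assay as described: the names `PRIMERS`, `gc_fraction` and `reverse_complement` are hypothetical, and the amplicon length rests on the assumption that each "Start Position" is the target coordinate of the leftmost base of the oligonucleotide as written.

```python
# Illustrative consistency check for the Ag3272 primer-probe set (Table NA).
# All names here are hypothetical helpers, not taken from the specification.

PRIMERS = {
    # role: (sequence 5'->3', stated length, stated start position)
    "forward": ("tgccctagcatcagttgaag", 20, 365),
    "probe": ("tggaatgaaacatgaagatgcagtaca", 27, 386),
    "reverse": ("tttaaaagctccactccgct", 20, 427),
}


def gc_fraction(seq: str) -> float:
    """Fraction of bases in seq that are G or C."""
    seq = seq.lower()
    return sum(base in "gc" for base in seq) / len(seq)


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence given in lower-case acgt."""
    return seq.lower().translate(str.maketrans("acgt", "tgca"))[::-1]


# Do the stated lengths agree with the sequences as printed?
lengths_ok = all(len(seq) == length for seq, length, _ in PRIMERS.values())

# ASSUMPTION: start positions mark the leftmost target base of each oligo,
# so the amplicon runs from the forward start to the end of the reverse site.
fwd_start = PRIMERS["forward"][2]
rev_seq, rev_len, rev_start = PRIMERS["reverse"]
amplicon_length = rev_start + rev_len - fwd_start

print("lengths consistent:", lengths_ok)
print("forward GC fraction:", gc_fraction(PRIMERS["forward"][0]))
print("reverse primer, opposite strand:", reverse_complement(rev_seq))
print("amplicon length under the stated assumption:", amplicon_length)
```

Under that coordinate assumption the three oligonucleotides do not overlap (forward 365-384, probe 386-412, reverse 427-446); a different coordinate convention would change the amplicon length but not the length check.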



Table NB. CNS_neurodegeneration_v1.0



Tissue Name 


Rel. Exp.(%) Ag3272, Run 
210038590 


Tissue Name 


Rel. Exp.(%) Ag3272, Run 
210038590 


AD 1 Hippo 


0.0 


Control (Path) 3 
Temporal Ctx 


0.0 


AD 2 Hippo 


0.0 


Control (Path) 4 
Temporal Ctx 


0.0


AD 3 Hippo


0.0 


AD 1 Occipital Ctx 


0.0 


AD 4 Hippo




AD 2 Occipital Ctx 
(Missing) 


0.0 


AD 5 Hippo 


0.0 


AD 3 Occipital Ctx 


0.0 


AD 6 Hippo 


0.0 


AD 4 Occipital Ctx 


0.0 


Control 2 Hippo 


0.1 


AD 5 Occipital Ctx 


0.0 


Control 4 Hippo 


0.1 


AD 6 Occipital Ctx 


0.0 


Control (Path) 3
Hippo


0.0 


Control 1 Occipital
Ctx


0.0 


AD 1 Temporal Ctx 


0.1 


Ctx 


0.0 


AD 2 Temporal Ctx 


0.1 


Ctx 


0.0 


AD 3 Temporal Ctx 


0.0 


i^oniroi H L/ccipiiai 
Ctx 


0.0 


AD 4 Temporal Ctx 


100.0 


Occipital Ctx 


0.3 


AD 5 Inf Temporal
Ctx


0.1


Control (Path) 2
Occipital Ctx


0.1


AD 5 Sup Temporal
Ctx


0.0


Control (Path) 3
Occipital Ctx


0.0


AD 6 Inf Temporal
Ctx


0.0


Control (Path) 4
Occipital Ctx


0.0


AD 6 Sup Temporal
Ctx


0.0


Control 1 Parietal Ctx


0.0


Control 1 Temporal
Ctx


0.0 


Control 2 Parietal Ctx 


0.1 


Control 2 Temporal 
Ctx 


0.0 


Control 3 Parietal Ctx 


0.0 


C^f\nfrf\\ T**mnrtra1 

Ctx 


0.0 


Control (Path) 1
Parietal Ctx 


0.0 


Control 3 Temporal 
Ctx 


0.0 


Control (Path) 2 
Parietal Ctx 


0.0 


Control (Path) 1 
Temporal Ctx 


0.1 


Control (Path) 3 
Parietal Ctx 


0.0 


Control (Path) 2 
Temporal Ctx 


0.1 


Control (Path) 4 
Parietal Ctx 


0.2 



Table NC. General_screening_panel_v1.4



Tissue Name 


Rel. Exp.(%) Ag3272, 
Run 215775344 


Tissue Name 


Rel.Exp.(%)Ag3272, 
Run 215775344 


Adipose 


0.0 


Renal ca. TK-10 


0.0 


Melanoma* 
Hs688(A).T 


0.0 


Bladder 


0.0 


Melanoma* 
Hs688(B).T 


0.3 


Gastric ca. (liver met.) 
NCI-N87 


0.1 


Melanoma* Ml 4 


0.0 


Gastric ca. KATO III 


0.0 


Melanoma*
LOXIMVI


0.1


Colon ca. SW-948


100.0


Melanoma* SK-MEL- 

5 


0.0 


Colon ca. SW480 


0.0 


Squamous cell 
carcinoma SCC-4 


0.0


Colon ca.* (SW480 met)
SW620


0.0


Testis Pool 


0.0 


Colon ca. HT29


0.0 


Prostate ca.* (bone 
met) PC-3 


0.0 


Colon ca. HCT-116 


0.0 


Prostate Pool 


0.0 


Colon ca CaCo-2 


0.0 


Placenta 


2.6 


Colon cancer tissue 


0.0 


Uterus Pool 


0.0 


Colon ca. SW1116


0.0 


Ovarian ca. OVCAR-3 


0.0 


Colon ca Colo-205 


0.0 


Ovarian ca. SK-OV-3 


0.0 


Colon ca SW-48 


0.0 


Ovarian ca. OVCAR-4 


0.0 


Colon Pool 


0.0 


Ovarian ca OVCAR-5 


0.0 


Small Intestine Pool 


0.0 


Ovarian ca. IGROV-1


0.0 


Stomach Pool 


0.1 


Ovarian ca OVCAR-8 


0.0 


Bone Marrow Pool 


0.0 


Ovary 


0.0 


Fetal Heart 


0.0 


Breast ca. MCF-7 


0.0 


Heart Pool 


0.0 


Breast ca. MDA-MB- 
231 


0.0 


Lymph Node Pool 


0.1 


Breast ca. BT 549 


0.0 


Fetal Skeletal Muscle 


0.0 


Breast ca T47D 


0.0 


Skeletal Muscle Pool 


0.0 


Breast ca MDA-N 


0.0 


Spleen Pool


0.0 


Breast Pool 


0.0 


Thymus Pool 


0.1 


Trachea 


0.0 


CNS cancer (glio/astro) 
U87-MG 


1.9 


Lung 


0.0 


CNS cancer (glio/astro)
U-118-MG


0.1 


Fetal Lung 


0.0 


CNS cancer (neuro;met) 
SK-N-AS 


0.0 


Lung ca. NCI-N417


0.0 


CNS cancer (astro) SF-
539


0.1 


Lung ca LX-1 


0.0 


CNS cancer (astro) SNB-
75


0.5 


Lung ca. NCI-H146


0.0 


CNS cancer (glio) SNB- 
19 


0.0 


Lung ca. SHP-77


A 1 


CNS cancer (glio) SF- 
295 


5 A 


Lung ca. A549 


0.0 


Brain (Amygdala) Pool 


0.0 


LungcaNCI-H526 


0.0 


Brain (cerebellum) 


0.0 


LungcaNCI-H23 


0.3 


Brain (fetal) 


0.2 


LungcaNCI-H460 


2.2 


Brain (Hippocampus) 
Pool 


0.0 


LungcaHOP-62 


0.0 


Cerebral Cortex Pool 


0.0 


Lungca.NCI-H522 


0.0 


Brain (Substantia nigra) 
Pool 


0.0


Liver


0.0 


Brain (Thalamus) Pool


0.0 


Fetal Liver 


0.0 


Brain (whole)


0.0 


Liver ca. HepG2 


0.0 


Spinal Cord Pool


0.1 


Kidney Pool 


0.0 


Adrenal Gland


0.0 


Fetal Kidney 


0.9 


Pituitary gland Pool


0.0 


Renal ca. 786-0 


0.0 


Salivary Gland


0.0 


Renal ca. A498 


0.0 


Thyroid (female)


0.0 


Renal ca. ACHN 


0.1 


Pancreatic ca. CAPAN2


0.0 


Renal ca. UO-31 


0.0 


Pancreas Pool


0.0 



Table ND. Panel 4D 



Tissue Name 


Rel. Exp.(%)Ag3272, 
Run 165128063 


Tissue Name 


Rel. Exp.(%)Ag3272, 
Run 165128063 


Secondary Thl act 


0.0 


HUVEC IL-lbeta 


0.0 


Secondary Th2 act




HUVEC IFN gamma




Secondary Trl act 


0.0 


HUVEC TNF alpha + IFN 
gamma 


0.0 


Secondary Th1 rest


0.0


HUVEC TNF alpha + IL4


0.0


Secondary Th2 rest 


0.0 


HUVEC IL-11


0.0 


Secondary Trl rest 


0.0 


Lung Microvascular EC none 


0.0 


Primary Thl act 


0.0 


Lung Microvascular EC 
TNFalpha + IL- 1 beta 


0.0 


Primary Th2 act 


0.0 


Microvascular Dermal EC 
none 


0.0 


Primary Trl act 


0.0 


Microsvasular Dermal EC 
TNFalpha + IL-lbeta 


0.0 


Primary Thl rest 


6.8 


Bronchial epithelium 
TNFalpha -t- ILlbeta 


6.6 


Primary Th2 rest 


0.0 


Small airway epithelium 
none 


0.0 


Primary Trl rest 


0.0 


Small airway epithelium 
TNFalpha + IL-lbeta 


5.3 


CD45RA CD4 
lymphocyte act 


0.0 


Coronery artery SMC rest 


0.0 


CD45RO CD4 
lymphocyte act 


7.8 


Coronery artery SMC 
TNFalpha + IL-1 beta 


0.0 


CDS lymphocyte act 


0.0 


Astrocytes rest 


13.8 


Secondary CDS 
lymphocyte rest 


0.0 


Astrocytes TNFalpha + IL- 
Ibeta 


0.0 


Secondary CDS 
lymphocyte act 


0.0 


KU-812 (Basophil) rest 


0.0 


CD4 lymphocyte none 


0.0 


KU-812 (Basophil) 
PMA/ionomycin 


0.0 


2ryThl/Ili2/Trl anti- 
CD95CH11 


0.0 


CCDl 106 (Keratinocytes) 
none 


0.0 


LAK cells rest 


0.0 


CCDl 106 (Keratinocytes) 
TNFalpha + IL- 1 beta 


0.0


LAK cells IL-2


0.0 


Liver cirrhosis


40.9 


LAK cells IL-2+IL-12 


0.0 


Lupus kidney 


21.3 


LAK cells IL-2+IFN 
gamma 


A A 

0.0 


NCI-H292 none


0.0


LAK cells IL-2+IL-18 


0.0 


NCI-H292 IL-4 


0.0 


LAK cells 
PMA/ionomycin 


u.u 




\)A} 


NK Cells IL-2 rest 


0.0 


NCI-H292 IL-13


0.0 


Two Way MLR 3 day 


0.0 


NCI-H292 IFN gamma 


0.0 


Two Way MLR 5 day 


0.0 


HPAEC none 


0.0 


Two Way MLR 7 day 


0.0 


HPAEC TNF alpha + IL-1 
beta 


0.0 


PBMC rest 


0.0 


Lung fibroblast none 


9.5 


PBMC PWM 


7.7 


Lung fibroblast TNF alpha + 
IL-1 beta 


0.0 


PBMC PHA-L 


5.8 


Lung fibroblast IL-4 


0.0 


Ramos (B cell) none


0.0


Lung fibroblast IL-9


0.0


Ramos (B cell)
ionomycin


11.6 


Lung fibroblast IL-1 3 


0.0 


B lymphocytes PWM 


0.0 


Lung fibroblast IFN gamma 


0.0 


B lymphocytes CD40L


0.0 


Dermal fibroblast CCD1070


0.0 


EOL-1 dbcAMP 


0.0 


Dermal fibroblast CCD1070
TNF alpha


0.0 


EOL-1 dbcAMP

PMA/ionomycin 


5.9 


Dermal fibroblast CCD1070

IL-1 beta 


0.0 


Dendritic cells none 


0.0 


Dermal fibroblast IFN 

gamma 


0.0 


Dendritic cells LPS 


4.0 


Dermal fibroblast IL-4 


7.7 


Dendritic cells anti- 
CD40 


0.0 


IBD Colitis 2 


2.9 


Monocytes rest 


0.0 


IBD Crohn's 


0.0 


Monocytes LPS 


0.0 


Colon 


6.4 


Macrophages rest 


0.0 


Lung 


5.8 


Macrophages LPS 


0.0 


Thymus 


100.0 


HUVEC none 


0.0 


Kidney 


0.0 


HUVEC starved 


0.0 







CNS_neurodegeneration_v1.0 Summary: Ag3272 Expression of the NOV22 gene is
low/undetectable in all samples on this panel (CTs>35). (Data not shown.)



General_screening_panel_v1.4 Summary: Ag3272 Expression of the NOV22 gene is
highest in a colon cancer cell line, SW-948 (CT=25.8). Moderate expression is also seen in
two brain cancer cell lines and a lung cancer cell line. Thus, expression of this gene could be
used to differentiate between these samples and other samples on this panel and as a marker
to detect the presence of colon cancer. Furthermore, therapeutic modulation of the expression
or function of this gene may be effective in the treatment of colon cancer.

In addition, this gene is expressed at much higher levels in fetal kidney (CT=32.6) 
than in adult kidney (CT=38). Thus, expression of this gene could be used to differentiate 
between adult and fetal sources of this tissue. Furthermore, expression of this gene in the fetal 
kidney suggests that this gene product may be involved in the development of this organ. 
Therefore, therapeutic modulation of the expression or function of this gene may be effective 
in the treatment of diseases of the kidney. 
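
The size of the fetal versus adult kidney difference can be estimated from the CT values quoted above. The sketch below is illustrative only: the function name `fold_difference` is hypothetical, and the calculation assumes ideal amplification (a doubling of product per cycle) and equal template input for both samples, neither of which is stated in this passage.

```python
# Illustrative estimate of the expression ratio implied by two CT values.
# ASSUMPTIONS: perfect doubling per cycle and equal input for both samples.

def fold_difference(ct_lower_expression: float,
                    ct_higher_expression: float,
                    efficiency: float = 2.0) -> float:
    """Return how many times more template the lower-CT sample contains."""
    return efficiency ** (ct_lower_expression - ct_higher_expression)


fetal_kidney_ct = 32.6   # CT reported for fetal kidney
adult_kidney_ct = 38.0   # CT reported for adult kidney

ratio = fold_difference(adult_kidney_ct, fetal_kidney_ct)
print(f"fetal/adult kidney expression ratio: about {ratio:.0f}-fold")
```

A gap of 5.4 cycles therefore corresponds to a difference of roughly forty-fold under these assumptions; a lower amplification efficiency would reduce the estimate.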

Panel 4D Summary: Ag3272 Significant expression of the NOV22 gene is restricted to the 
thymus (CT=34.1). Thus, the protein encoded by this gene may play an important role in T 
cell development and be a marker for this lymphoid tissue. Small molecule therapeutics, or 
antibody therapeutics designed against the protein encoded by this gene could be utilized to 
modulate immune function (T cell development) and be important for organ transplant, AIDS 
treatment or post chemotherapy immune reconstitution. 

O. NOV27 - CG57228-01: ALDO-KETO REDUCTASE FAMILY 7, MEMBER
A3-like protein

Expression of the NOV27 gene was assessed using the primer-probe set Ag3143, 
described in Table OA. Results of the RTQ-PCR runs are shown in Tables OB, OC, OD and 
OE. 

Table OA. Probe Name Ag3143

Primers | Sequences | Length | Start Position | SEQ ID NO:
Forward | 5'-ccctgaagcctgacagtgt-3' | 19 | 308 | 515
Probe | TET-5'-ctgcagtgtcccagagtggacctctt-3'-TAMRA | 26 | 358 | 516
Reverse | 5'-tgtggtcaggtgcatgtagata-3' | 22 | 385 | 517



Table OB. CNS_neurodegeneration_v1.0



Tissue Name 


Rel. Exp.(%) Ag3143, Run
209057242


Tissue Name


Rel. Exp.(%) Ag3143, Run
209057242


AD 1 Hippo 


27.4 


Control (Path) 3
Temporal Ctx


13.5


AD 2 Hippo


57.0


Control (Path) 4
Temporal Ctx


42.9


AD 3 Hippo


20.2


AD 1 Occipital Ctx


30.8


AD 4 Hippo


13.3


AD 2 Occipital Ctx
(Missing)


0.0




AD 5 Hippo


83.5


AD 3 Occipital Ctx 


14.7 


AD 6 Hippo


85.3


AD 4 Occipital Ctx 


29.3 


Control 2 Hippo 


42.9 


AD 5 Occipital Ctx 


41.2 


Control 4 Hippo 


53.2 


AD 6 Occipital Ctx 


21.8 


Control (Path) 3 
Hippo 


17.3 


Control 1 Occipital
Ctx 


6.8 


AD 1 Temporal Ctx


34.6 


Ctx 


64.6 


AD 2 Temporal Ctx


42.6 


Ctx 


30.8 


AD 3 Temporal Ctx


22.4 


Ctx 


28.9 






^^oniTOi \^Jrain^ i 
Occipital Ctx 


100.0 


AD 5 Inf Temporal 

Ctx 




Occipital Ctx 


18.9 


AD 5 Sup Temporal 
Ctx 


1% ^ 

/ O.J 


L^onuoi (^r ain ) 5 
Occipital Ctx 


5.9 


AD 6 Inf Temporal 

Ctx 




control i^ratnj 4 
Occipital Ctx 


23.5 


AD 6 Sup Temporal 
Ctx 




Control 1 Parietal Ctx 


15.4 


Control 1 Temporal 
Ctx 


17.6 


Control 2 Parietal Ctx 


66.4 


Control 2 Temporal 
Ctx 


49.3 


Control 3 Parietal Ctx 


33.0 


Control 3 Temporal 
Ctx 


23.8 


Parietal Ctx 


60.7 


Control 3 Temporal 
Ctx 


26.1 


Control (Path) 2 
Parietal Ctx 


27.5 


Control (Path) 1 
Temporal Ctx 


64.6 


Control (Path) 3 
Parietal Ctx 


8.3 


Control (Path) 2 
Temporal Ctx 


44.1 


Control (Path) 4 
Parietal Ctx 


33.4 



Table OC. Panel 1.3D



Tissue Name 


Rel. Exp.(%) Ag3143,
Run 167994819


Tissue Name


Rel. Exp.(%) Ag3143,
Run 167994819


Liver adenocarcinoma 


18.9 


Kidney (fetal) 


26.2 


Pancreas 


30.6 


Renal ca. 786-0 


2.4 


Pancreatic ca. CAPAN 2 


6.4 


Renal ca. A498 


8.5 


Adrenal gland 


4.7 


Renal ca.RXF 393 


27.2 


Thyroid 


3.8 


Renal ca. ACHN 


8.5 


Salivary gland 


1.6 


Renal ca. UO-31 


7.0 


Pituitary gland 


8.5 


Renal ca. TK-10 


5.5 


Brain (fetal) 


15.3 


Liver 


66.0


Brain (whole)


13.3


Liver (fetal) 


4.5 


Brain (amygdala) 


6.3 


Liver ca. (hepatoblast) 
HepG2 


19.1 


Brain (cerebellum) 


19.6 


Lung 


3.0 


Brain (hippocampus) 


14.5


Lung (fetal) 


8.8 


Brain (substantia nigra) 


12.6 


LX-1 


18.0 


Brain (thalamus) 


5.4 


Lung ca. (small cell)
NCI-H69


3.3 


Cerebral Cortex 


19.5 


Lung ca. (s.cell var.)
SHP-77


100.0 


Spinal cord 


9.5 


Lung ca. (large
cell) NCI-H460


3.1 

\ . 1 1 ,. 


glio/astro U87-MG 


15.0 


Lung ca. (non-sm. cell) 
A549 


16.3 


glio/astro U-118-MG


1.9 


Lung ca. (non-s.cell)
NCI-H23 


10.6 


astrocytoma SW1783 


20.4 


Lung ca. (non-s.cell) 
HOP-62 


0.0 


neuro*; met SK-N-AS 


14.5 


Lung ca. (non-s.cl) 
NCI-H522 


14.8 


astrocytoma SF-539 


5.8 


Lung ca. (squam.) SW 

1900 


4.9 


astrocytoma SNB-75 


17.0 


Lung ca. (squam.) 
NCI-H596 


4.6 


glioma SNB-19 


3.4 


Mammary gland 


14.7 


glioma U251 


11.6 


Breast ca.* (pl.ef)
MCF-7


10.9



glioma SF-295


9.0 


Breast ca.* (pl.ef) 
MDA-MB-231 


i7.9 


Heart (fetal) 


52.1 


Breast ca.* (pl.ef) 
T47D 


40.9 


Heart 


2.4 


Breast ca. BT-549 


5.7 


Skeletal muscle (fetal) 


16.5 


Breast ca. MDA-N 


7.5 


Skeletal muscle 


8.8 


Ovary 


14.9 


Bone marrow 


1.3 


Ovarian ca. OVCAR-3 


7.8 


Thymus 


5.1 


Ovarian ca. OVCAR-4 


9.5 


Spleen 


3.0 


Ovarian ca. OVCAR-5 


29.7


Lymph node 


2.8 


Ovarian ca OVCAR-8 


2.4 


Colorectal 


15.8 


Ovarian ca. IGROV-1 


3.8 


Stomach 


32.3 


Ovarian ca* (ascites) 
SK-OV-3 


37.9 


Small intestine 


29.9 


Uterus 


4.2 


Colon ca. SW480 


8.8 


Placenta 


1.0 


Colon ca.* 
SW620(SW480 met) 


41.2 


Prostate 


4.2 


Colon ca. HT29 


29.3 


Prostate ca.* (bone
met) PC-3


7.7


Colon ca. HCT-116 


8.5 


Testis 


3.4 


Colon ca. CaCo-2 


14.9 


Melanoma Hs688(A).T 


10.4 


Colon ca. 
tissue(OD03866) 


9.2 


Melanoma* (met) 
Hs688(B).T 


2^ 


Colon ca. HCC-2998 


26.6


Melanoma UACC-62 


5.8 


Gastric ca.* (liver met) 
NCI-N87 


9.5 


Melanoma M14 


5.2 


Bladder 


20.2 


Melanoma LOXIM VI 


63 


Trachea 


1.8 


Melanoma* (met) SK- 
MEL-5 


63 


Kidney 


59.9 


Adipose 


43 



Table OD. Panel 4D



Tissue Name 


Rel. Exp.(%) Ag3143,
Run 164527996


Tissue Name


Rel. Exp.(%) Ag3143,
Run 164527996


Secondary Thl act 


3.2 


HUVECIL-lbeta 


1.5 


Secondary Th2 act 


2.7 


HUVEC IFN gamma 


3.6 


Secondary Trl act 


1.7 


HUVEC TNF alpha + IFN 
gamma 


3.0 


Secondary Thl rest 


1.2 


HUVEC TNF alpha + IL4 


5.8 


Secondary Th2 rest 


1.6 


HUVEC IL- 11 


4.3 


Secondary Trl rest 


2.3 


Lung Microvascular EC none 


4.5 


Primary Thl act 


53 


TNFalpha+IL-lbeta 


5.7 


Primary Th2 act 


4.9 


Microvascular Dermal EC 
none 


6.2 


Primary Trl act 


6.3 


Microsvasular Dermal EC 
TNFalpha + IL-lbeta 


5.4 


Primary Thl rest 


8.2 


Bronchial epithelium 
TNFalpha + ILlbeta 


2.9 


Primary Th2 rest 


5.9 


Small airway epithelium 
none 


4.2 


Primary Trl rest 


6.5 


Small airway epithelium 
TNFalpha + IL-lbeta 


11.0 


CD45RA CD4 
lymphocyte act 


5.8 


Coronery artery SMC rest 


8.2 


CD45ROCD4 
lymphocyte act 


5.7 


Coronery artery SMC 
TNFalpha + IL-lbeta 


4.8 


CD8 lymphocyte act 


7.3 


Astrocytes rest 


4.3 


Secondary CD8 
lymphocyte rest 


5.4 


Astrocytes TNFalpha + IL- 

Ibeta 


3.1 


Secondary CD8 
lymphocyte act 


6.3 


KU-812 (Basophil) rest 


7.4 


CD4 lymphocyte none 


2.6 


KU-812 (Basophil) 
PMA/ionomycin 


6.3 


2ry Th1/Th2/Tr1_anti-
CD95 CH11


5.5


CCD1106 (Keratinocytes)
none


4.4


LAK cells rest


CCD1106 (Keratinocytes)
TNFalpha + IL-1beta


1.5 


LAK cells IL-2


1 0.3 |Liver cirrhosis 


18.0 


LAK cells IL-2+IL-12 


6.3 


Lupus kidney 


2.8 


LAK cells IL-2+IFN 
gamma 


7.1 


NCI-H292none 




LAK cells IL-2+IL-18 


5.9 


NCI-H292 IL-4 


11.0 


LAK cells 
PMA/ionomycin


2.9 


NCI-H292 IL-9 


1 1 A 


NK Cells IL-2 rest 


2.9 


NCI-H292 IL-13


63 


Two Way MLR 3 day 


4.3 


NCI-H292 IFN gamma 


7.1 


Two Way MLR 5 day 


4.7 


HPAEC none 


3.8 


Two Way MLR 7 day 


3.8 


HPAEC TNF alpha + IL-1
beta


3.7 


PBMC rest 


s.z 


Lung fibroblast none 


2.9 


PBMCPWM 


11.8 


Lung fibroblast TNF alpha + 
IL-1 beta


1.6 


PBMC PHA-L 


7.7 


Lung fibroblast IL-4 


7.1 




6.5 


Lung fibroblast IL-9


4.6 


Ramos (B cell) 


31.2 


Lung fibroblast IL-13 


5.8 


B lymphocytes PWM 


18.3 


Lung fibroblast IFN gamma 


5.8 


B lymphocytes CD40L 
and IL-4


17.1 


Dermal fibroblast CCD1070 

rest 


12.9 


EOL-1 dbcAMP 


12.8 


Dermal fibroblast CCD1070
TNF alpha 


12.9 


EOL-1 dbcAMP 

PMA/ionomycin 


4.6 


Dermal fibroblast CCD1070
IL-1 beta 


3.9 


Dendritic cells none 


4.7


Dermal fibroblast IFN 

gamma 




Dendritic cells LPS 


2.1


Dermal fibroblast IL-4 


15.2 


Dendritic cells anti- 
CD40 


4.8 


IBD Colitis 2 


0.8 


Monocytes rest 


3.5 


IBD Crohn's


28.9 


Monocytes LPS


0.4 


Colon 


98.6 


Macrophages rest 


10.8 


Lung 


6.3 


Macrophages LPS 


3.1 


Thymus 


100.0 


HUVEC none 


5.8 


Kidney 


13.1 


HUVEC starved 


12.8 







Table OE. Panel 5 Islet



Tissue Name 


Rel. Exp.(%) 

Ag3143,Run 

233698023 


Tissue Name 


Rel. Exp.(%) 
Ag3143, Run 

233698023 


97457„Patient- 
02go_adipose 


23.7 


94709_Donor 2 AM - A_adipose 


63.3 
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97476_Patient- 
07<;k "skeletal muscle 


1K2 


94710_Donor 2 AM - B_adipose 


26.1 


97477_Patient- 
07iit litems 


18.4 


9471 IJDonor 2 AM - C_adipose 


19.2 


97478_Patient- 
07nl niacetita 


13.1 


94712_Donor 2 AD - A_adipose 


54.0 


991 67_Bayer Patient 1 


72.7 


947 1 SJDonor 2 AD - B_adipose 


71.7 


97482_Patient- 
08ut uteras 


10.0 


94714_Donor 2 AD - C_adipose 


77.9 


97483_Patient- 
08pl_j>lacenta 


11.0 


y4 /4z__L)onor 3 U - 
A_Mesenchymal Stem Cells 


7.3 


97486_Patient- 
09sk skeletal muscle 


8.1 


y4 / 43_JL>onor 3 u ~ 
BJMesenchymal Stem Cells 


6.3 


97487_Patient- 
09ut uterus 


11.3 


94730_Donor 3 AM - A_adipose 


10.6 


97488_Patient- jj^ g 
09pl_placenta I 


94731_Donor 3 AM - Bjadipose 


6.5 


97492JPatien^ i^^ ^ 
10ut_uterus j 


94732JDonor 3 AM - C_adipose 


6.3 


97493_Patient- f^^ ^ 
10pl_placenta j 


94733_E>onor 3 AD - A_adipose 


11.0 


97495_Patient- |g ^ 
llgo_adipose 1 


94734__Donor 3 AD - Bjadipose 


7.0 


97496_Patient- 

1 Isk skeletal muscle 


il4.7 


94735JDonor 3 AD - C_adipose 


3.4 


97497_Patient- 
llut_jiterus 


I22.4 


77138_Liver_HepG2untreated 


46.3 


97498JPatient- ^ 
llpl_placenta 1 


7355o_HeartjLardiac stromal cells 
(primary) 


6.2 


97500_Patient- f^^ ^ 
12go_adipose | 


81735_Small Intestine 


76.8 


97501_Patient- f^^ ^ 
12sk_skeletai muscle j 


72409_Kidney_ProximaI 
Convoluted Tubule 


9.2 


97502JPatient- |^ ^ 
12ut_uterus | 


82685_Small intestine Duodenum 


100.0 


97503 J^atierrt- ^ 
12pl_placenta | 


90650_Adrenal_Adrenocortical 
adenoma 


6.2 


94721J3onor2U- j 
A_MesenchymaI Stem 145.4 
Cells 1 


724 1 0_Kidney_HRCE 


27.0 


94722JDonor2U- ] 
B_Mesenchymal Stem p9.8 
Cells 1 


72411_Kidney_HRE 


8.7 


94723JDonor2U> } 
C_Mesenchymal Stem 164.2 
Cells f 


73139_Uterus_Uterine smooth 
miiscle cells 


15.9 



CNS_neurodegeneration_v1.0 Summary: Ag3143 This panel does not show differential
expression of the NOV27 gene in Alzheimer's disease. However, this expression profile
confirms the presence of this gene in the brain. Please see Panel 1.3D for discussion of utility
of this gene in the central nervous system.

Panel 1.3D Summary: Ag3143 The NOV27 gene is expressed at a low level in most of the
cancer cell lines and normal tissues on this panel. There appears to be significantly higher 
expression in lung, breast and ovarian cancer cell lines. Thus, therapeutic inhibition of this 
gene product, through the use of small molecule drugs, might be of utility in the treatment of 
the above listed cancer types. 

Among metabolic tissues, this gene has low levels of expression (CT values = 31-34) 
in pancreas, pituitary, skeletal muscle and liver. This aldo-keto reductase may be a small
molecule target for the treatment of endocrine and metabolic disease, including Types 1 and 2 
diabetes and obesity. In addition, this gene appears to be differentially expressed in fetal (CT 
value = 32) vs adult heart (CT value = 36) and may be useful for the identification of the fetal 
phenotype in this tissue. It also appears to be differentially expressed in adult (CT value = 31) 
vs fetal liver (CT value = 35) and may also be useful for the identification of the adult 
phenotype in this tissue. 
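
The fetal-versus-adult comparisons above rest on differences in CT value. As a rough sketch, and assuming ideal amplification (a doubling of product per cycle, which is an assumption and not stated in this section), a CT difference can be converted into an approximate fold difference in transcript level. The function name `fold_difference` is hypothetical.

```python
def fold_difference(ct_lower_expr: float, ct_higher_expr: float,
                    efficiency: float = 2.0) -> float:
    """Approximate fold difference implied by two CT values.

    A higher CT means less template. `efficiency` is the assumed
    per-cycle amplification factor (2.0 = perfect doubling).
    """
    return efficiency ** (ct_lower_expr - ct_higher_expr)


# CT values quoted in the Panel 1.3D summary for the NOV27 gene.
heart_fold = fold_difference(ct_lower_expr=36, ct_higher_expr=32)  # adult vs fetal heart
liver_fold = fold_difference(ct_lower_expr=35, ct_higher_expr=31)  # fetal vs adult liver

print(f"fetal heart vs adult heart: ~{heart_fold:.0f}-fold")
print(f"adult liver vs fetal liver: ~{liver_fold:.0f}-fold")
```

Under this assumption, each four-cycle gap corresponds to roughly a 16-fold difference. Real assays rarely reach perfect efficiency, so the figure is an upper-bound style estimate.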

In addition, low expression throughout the brain suggests a role for this gene in CNS 
processes. Members of the aldo-keto reductase superfamily are known to function in the 
processing of hormones in the brain. Brain hormone regulation mediates numerous clinically 
significant conditions, including psychiatric disorders such as anxiety, overeating and 
memory disorders. Therefore, agents that modulate the activity of this gene product have 
potential utility in the treatment of these disorders. 

References: 

Penning TM, Burczynski ME, Jez JM, Hung CF, Lin HK, Ma H, Moore M, Palackal
N, Ratnam K. Human 3alpha-hydroxysteroid dehydrogenase isoforms (AKR1C1-AKR1C4)
of the aldo-keto reductase superfamily: functional plasticity and tissue distribution reveals
roles in the inactivation and formation of male and female sex hormones. Biochem J 2000
Oct 1;351(Pt 1):67-77

The kinetic parameters, steroid substrate specificity and identities of reaction products
were determined for four homogeneous recombinant human 3alpha-hydroxysteroid
dehydrogenase (3alpha-HSD) isoforms of the aldo-keto reductase (AKR) superfamily. The
enzymes correspond to type 1 3alpha-HSD (AKR1C4), type 2 3alpha(17beta)-HSD
(AKR1C3), type 3 3alpha-HSD (AKR1C2) and 20alpha(3alpha)-HSD (AKR1C1), and share
at least 84% amino acid sequence identity. All enzymes acted as NAD(P)(H)-dependent 3-,
17- and 20-ketosteroid reductases and as 3alpha-, 17beta- and 20alpha-hydroxysteroid
oxidases. The functional plasticity of these isoforms highlights their ability to modulate the
levels of active androgens, oestrogens and progestins. Salient features were that AKR1C4
was the most catalytically efficient, with k(cat)/K(m) values for substrates that exceeded
those obtained with other isoforms by 10-30-fold. In the reduction direction, all isoforms
inactivated 5alpha-dihydrotestosterone (17beta-hydroxy-5alpha-androstan-3-one;
5alpha-DHT) to yield 5alpha-androstane-3alpha,17beta-diol (3alpha-androstanediol). However, only
AKR1C3 reduced Delta(4)-androstene-3,17-dione to produce significant amounts of
testosterone. All isoforms reduced oestrone to 17beta-oestradiol, and progesterone to
20alpha-hydroxy-pregn-4-ene-3,20-dione (20alpha-hydroxyprogesterone). In the oxidation
direction, only AKR1C2 converted 3alpha-androstanediol to the active hormone
5alpha-DHT. AKR1C3 and AKR1C4 oxidized testosterone to Delta(4)-androstene-3,17-dione. All
isoforms oxidized 17beta-oestradiol to oestrone, and 20alpha-hydroxyprogesterone to
progesterone. Discrete tissue distribution of these AKR1C enzymes was observed using
isoform-specific reverse transcriptase-PCR. AKR1C4 was virtually liver-specific and its high
k(cat)/K(m) allows this enzyme to form 5alpha/5beta-tetrahydrosteroids robustly. AKR1C3
was most prominent in the prostate and mammary glands. The ability of AKR1C3 to
interconvert testosterone with Delta(4)-androstene-3,17-dione, but to inactivate 5alpha-DHT,
is consistent with this enzyme eliminating active androgens from the prostate. In the
mammary gland, AKR1C3 will convert Delta(4)-androstene-3,17-dione to testosterone (a
substrate aromatizable to 17beta-oestradiol), oestrone to 17beta-oestradiol, and progesterone
to 20alpha-hydroxyprogesterone, and this concerted reductive activity may yield a
pro-oesterogenic state. AKR1C3 is also the dominant form in the uterus and is responsible for the
synthesis of 3alpha-androstanediol, which has been implicated as a parturition hormone. The
major isoforms in the brain, capable of synthesizing anxiolytic steroids, are AKR1C1 and
AKR1C2. These studies are in stark contrast with those in rat where only a single AKR with
positional- and stereo-specificity for 3alpha-hydroxysteroids exists. [egunther, 29-Jan-02]

Panel 4D Summary: Ag3143 The NOV27 gene is expressed at high to moderate levels in a
wide range of cell types of significance in the immune response and tissue response in health
and disease, with the highest expression being detected in colon and thymus (CT=28.1).
Therefore, targeting of this gene product with a small molecule drug or antibody therapeutic
may modulate the functions of cells of the immune system as well as resident tissue cells and
lead to improvement of the symptoms of patients suffering from autoimmune and
inflammatory diseases such as COPD, emphysema, asthma, allergies, inflammatory bowel
disease, lupus erythematosus, and arthritis, including osteoarthritis and rheumatoid arthritis.

Panel 5 Islet Summary: Ag3143 The NOV27 gene has low levels of expression in adipose, 
skeletal muscle and Islets of Langerhans. It is also expressed at low levels in mesenchymal 
stem cells that can be differentiated in vitro into adipocytes, chondrocytes and osteocytes. 
Therefore, this gene product may be a small molecule target for the treatment of diseases of
bone, cartilage and adipose tissue.

P. NOV25 - CG57276-01: ENDOLYN PRECURSOR-like protein 

Expression of the NOV25 gene was assessed using the primer-probe set Ag3149,
described in Table PA. 

Table PA. Probe Name Ag3149

Primers | Sequences | Length | Start Position | SEQ ID NO:
Forward | 5'-ccccttctacaacttccaagac-3' | 22 | 468 | 518
Probe | TET-5'-caacaaataacactgtgactccaacctca-3'-TAMRA | 29 | 507 | 519
Reverse | 5'-aaggtagactttcgcacaggtt-3' | 22 | 537 | 520

CNS_neurodegeneration_v1.0 Summary: Ag3149 Expression of the NOV25 gene is
low/undetectable in all samples on this panel (CTs>35). (Data not shown.)

Panel 1.3D Summary: Ag3149 Expression of the NOV25 gene is low/undetectable in all
samples on this panel (CTs>35). (Data not shown.)

Panel 4D Summary: Ag3149 Expression of the NOV25 gene is low/undetectable in all
samples on this panel (CTs>35). (Data not shown.)

Q. NOV26 - CG57224-01: ARYLACETAMIDE DEACETYLASE

Expression of the NOV26 gene was assessed using the primer-probe set Ag3136, 
described in Table QA. Results of the RTQ-PCR runs are shown in Tables QB and QC. 

Table QA. Probe Name Ag3136

Primers | Sequences | Length | Start Position | SEQ ID NO:
Forward | 5'-cccagmccactcactcatta-3' | 22 | 203 | 521
Probe | TET-5'-acagtgctcttggccctgcatgt-3'-TAMRA | 23 | 226 | 522
Reverse | 5'-acaggatatagaccccaaatgg-3' | 22 | 259 | 523

Table QB. Panel 1.3D

Tissue Name | Rel. Exp.(%) Ag3136, Run 167994424 | Tissue Name | Rel. Exp.(%) Ag3136, Run 167994424
Liver adenocarcinoma | 11.6 | Kidney (fetal) | 24.7
Pancreas | 2.8 | Renal ca. 786-0 | 3.7
Pancreatic ca. CAPAN 2 | 15.6 | Renal ca. A498 | 5.9
Adrenal gland | 1.4 | Renal ca. RXF 393 | 0.7
Thyroid | 5.0 | Renal ca. ACHN |
Salivary gland | 4.3 | Renal ca. UO-31 | 1.3
Pituitary gland | 1.4 | Renal ca. TK-10 | 3.8
Brain (fetal) | 0.7 | Liver | 4.5
Brain (whole) | 1.7 | Liver (fetal) | 2.8
Brain (amygdala) | 0.5 | Liver ca. (hepatoblast) HepG2 | [illegible]
Brain (cerebellum) | 0.6 | Lung | 6.2
Brain (hippocampus) | 0.7 | Lung (fetal) | 7.2
Brain (substantia nigra) | [illegible] | Lung ca. (small cell) LX-1 | 4.8
Brain (thalamus) | 1.4 | Lung ca. (small cell) NCI-H69 | 3.7
Cerebral Cortex | 0.8 | Lung ca. (s.cell var.) SHP-77 | 1.6
Spinal cord | | Lung ca. (large cell) NCI-H460 | 3.1
glio/astro U87-MG | 7.5 | Lung ca. (non-sm. cell) A549 | 18.0
glio/astro U-118-MG | 0.0 | Lung ca. (non-s.cell) NCI-H23 | 1.4
astrocytoma SW1783 | 1.6 | Lung ca. (non-s.cell) HOP-62 | 3.9
neuro*; met SK-N-AS | 0.0 | Lung ca. (non-s.cl) NCI-H522 | 0.0
astrocytoma SF-539 | 0.0 | Lung ca. (squam.) SW 900 | 63.3
astrocytoma SNB-75 | 32.8 | Lung ca. (squam.) NCI-H596 | 2.5
glioma SNB-19 | 1.6 | Mammary gland | 10.8
glioma U251 | 19.2 | Breast ca.* (pl.ef) MCF-7 | 0.3
glioma SF-295 | 13.9 | Breast ca.* (pl.ef) MDA-MB-231 | 0.0
Heart (fetal) | 2.1 | Breast ca.* (pl.ef) T47D | 3.5
Heart | 3.1 | Breast ca. BT-549 | 2.5
Skeletal muscle (fetal) | 7.0 | Breast ca. MDA-N | 1.0
Skeletal muscle | 1.0 | Ovary | 0.6
Bone marrow | 4.0 | Ovarian ca. OVCAR-3 | 0.9
Thymus | 3.0 | Ovarian ca. OVCAR-4 | 0.0
Spleen | 1.6 | Ovarian ca. OVCAR-5 | 100.0
Lymph node | 6.4 | Ovarian ca. OVCAR-8 | 0.6
Colorectal | 0.8 | Ovarian ca. IGROV-1 | 0.4
Stomach | 5.1 | Ovarian ca.* (ascites) SK-OV-3 | 1.3
Small intestine | 4.1 | Uterus | 5.7
Colon ca. SW480 | 0.7 | Placenta | 1.7
Colon ca.* SW620 (SW480 met) | 2.0 | Prostate | 12.4
Colon ca. HT29 | 3.5 | Prostate ca.* (bone met) PC-3 | 4.4
Colon ca. HCT-116 | 3.1 | Testis | 0.3
Colon ca. CaCo-2 | 6.3 | Melanoma Hs688(A).T | 3.2
Colon ca. tissue(OD03866) | 0.0 | Melanoma* (met) Hs688(B).T | 1.0
Colon ca. HCC-2998 | 2.5 | Melanoma UACC-62 | 0.0
Gastric ca.* (liver met) NCI-N87 | 21.6 | Melanoma M14 | 2.4
Bladder | 11.5 | Melanoma LOX IMVI | 5.8
Trachea | 4.6 | Melanoma* (met) SK-MEL-5 | 0.0
Kidney | 8.4 | Adipose | 4.2

Table QC. Panel 4D

Tissue Name | Rel. Exp.(%) Ag3136, Run 164527948 | Tissue Name | Rel. Exp.(%) Ag3136, Run 164527948
Secondary Th1 act | 4.1 | HUVEC IL-1beta | 1.8
Secondary Th2 act | 11.0 | HUVEC IFN gamma | 8.8
Secondary Tr1 act | 15.3 | HUVEC TNF alpha + IFN gamma | 3.5
Secondary Th1 rest | 4.8 | HUVEC TNF alpha + IL4 | 1.3
Secondary Th2 rest | 10.1 | HUVEC IL-11 | 9.2
Secondary Tr1 rest | 7.9 | Lung Microvascular EC none | 3.6
Primary Th1 act | 32.1 | Lung Microvascular EC TNFalpha + IL-1beta | 7.4
Primary Th2 act | 22.5 | Microvascular Dermal EC none | 15.2
Primary Tr1 act | 40.1 | Microvascular Dermal EC TNFalpha + IL-1beta | 6.9
Primary Th1 rest | 97.3 | Bronchial epithelium TNFalpha + IL1beta | 3.4
Primary Th2 rest | 41.5 | Small airway epithelium none | 1.8
Primary Tr1 rest | 54.7 | Small airway epithelium TNFalpha + IL-1beta | 6.3
CD45RA CD4 lymphocyte act | 12.1 | Coronary artery SMC rest | 30.1
CD45RO CD4 lymphocyte act | 13.6 | Coronary artery SMC TNFalpha + IL-1beta | 4.6
CD8 lymphocyte act | 20.4 | Astrocytes rest | 6.0
Secondary CD8 lymphocyte rest | 4.9 | Astrocytes TNFalpha + IL-1beta | 0.0
Secondary CD8 lymphocyte act | 7.9 | KU-812 (Basophil) rest | 24.8
CD4 lymphocyte none | 37.4 | KU-812 (Basophil) PMA/ionomycin | 80.7
2ry Th1/Th2/Tr1 anti-CD95 CH11 | 13.6 | CCD1106 (Keratinocytes) none | 1.7
LAK cells rest | 14.1 | CCD1106 (Keratinocytes) TNFalpha + IL-1beta | 0.0
LAK cells IL-2 | 13.0 | Liver cirrhosis | 11.4
LAK cells IL-2+IL-12 | 16.2 | Lupus kidney | 2.6
LAK cells IL-2+IFN gamma | | NCI-H292 none | 100.0
LAK cells IL-2+IL-18 | 39.8 | NCI-H292 IL-4 | 57.0
LAK cells PMA/ionomycin | [illegible] | NCI-H292 IL-9 | [illegible]
NK Cells IL-2 rest | 49.3 | NCI-H292 IL-13 | 28.3
Two Way MLR 3 day | 13.8 | NCI-H292 IFN gamma | 35.1
Two Way MLR 5 day | 13.1 | HPAEC none | 5.8
Two Way MLR 7 day | 9.8 | HPAEC TNF alpha + IL-1 beta | 2.0
PBMC rest | 13.7 | Lung fibroblast none | 0.7
PBMC PWM | 46.3 | Lung fibroblast TNF alpha + IL-1 beta | 0.0
PBMC PHA-L | 27.7 | Lung fibroblast IL-4 | 0.9
Ramos (B cell) none | 0.0 | Lung fibroblast IL-9 | 0.0
Ramos (B cell) ionomycin | 0.0 | Lung fibroblast IL-13 | 0.0
B lymphocytes PWM | 22.1 | Lung fibroblast IFN gamma | 0.6
B lymphocytes CD40L and IL-4 | 8.7 | Dermal fibroblast CCD1070 rest | 6.0
EOL-1 dbcAMP | 0.0 | Dermal fibroblast CCD1070 TNF alpha | 14.0
EOL-1 dbcAMP PMA/ionomycin | 3.7 | Dermal fibroblast CCD1070 IL-1 beta | 7.3
Dendritic cells none | 7.5 | Dermal fibroblast IFN gamma | 1.3
Dendritic cells LPS | 9.5 | Dermal fibroblast IL-4 | 6.0
Dendritic cells anti-CD40 | 18.7 | IBD Colitis 2 | 0.6
Monocytes rest | 0.0 | IBD Crohn's | 4.0
Monocytes LPS | 0.5 | Colon | 26.6
Macrophages rest | 27.0 | Lung | 9.5
Macrophages LPS | 6.3 | Thymus | 82.4
HUVEC none | 10.4 | Kidney | 27.2
HUVEC starved | 10.9 | |





Panel 1.3D Summary: Ag3136 Expression of the NOV26 gene is widespread throughout
this panel, with highest expression in an ovarian cancer cell line (CT=31). Significant levels
of expression are also seen in cell lines derived from lung, gastric, and brain cancers. Thus,
expression of the NOV26 gene could be used to differentiate between these samples and
other samples on this panel and as a marker to detect the presence of these cancers.
Furthermore, therapeutic modulation of the expression or function of this gene may be
effective in the treatment of lung, gastric, and brain cancers.
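
The tables report each sample as a percentage of the highest-expressing sample on the panel, while the summaries quote CT values. The sketch below illustrates how the two could be related, assuming that relative expression is 100 x 2^(CT_min - CT) with perfect doubling per cycle. That formula is an assumption made for illustration and is not restated in this section. Only CT=31 for the top sample comes from the text; the other CT values and sample labels are hypothetical.

```python
def relative_expression(ct_by_sample: dict[str, float],
                        efficiency: float = 2.0) -> dict[str, float]:
    """Scale each sample to the highest expresser (lowest CT), as a percentage.

    `efficiency` is the assumed per-cycle amplification factor.
    """
    ct_min = min(ct_by_sample.values())
    return {
        sample: round(100.0 * efficiency ** (ct_min - ct), 1)
        for sample, ct in ct_by_sample.items()
    }


# CT=31 is the value quoted for the top sample; the other two are made up.
example_cts = {
    "top sample (CT=31)": 31.0,
    "hypothetical sample B": 32.0,
    "hypothetical sample C": 34.0,
}

result = relative_expression(example_cts)
for sample, percent in result.items():
    print(f"{sample}: {percent}")
```

On this scale a sample one cycle behind the top sample reads as 50%, and a sample three cycles behind reads as 12.5%, which is why a single dominant sample can make most other entries on a panel look small.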

Panel 4D Summary: Ag3136 Highest expression of the NOV26 gene is seen in untreated 
NCI-H292 cells (CT=29). Significant levels of expression are also seen in a cluster of treated 
cells derived from the NCI-H292 cells, a human airway epithelial cell line that produces
mucins. Mucus overproduction is an important feature of bronchial asthma and chronic 
obstructive pulmonary disease samples. The NOV26 transcript is also expressed at lower but 
still significant levels in small airway epithelium treated with IL-1 beta and TNF-alpha. The 
expression of the transcript in this mucoepidermoid cell line that is often used as a model for 
airway epithelium (NCI-H292 cells) suggests that this transcript may be important in the 
proliferation or activation of airway epithelium. Therefore, therapeutics designed with the 
protein encoded by the transcript may reduce or eliminate symptoms caused by inflammation 
in lung epithelia in chronic obstructive pulmonary disease, asthma, allergy, and emphysema.

In addition, this transcript is induced in the PMA and ionomycin treated basophil cell 
line KU-812. Basophils release histamines and other biological modifiers in response to 
allergens and play an important role in the pathology of asthma and hypersensitivity 
reactions. Therefore, therapeutics designed against the putative protein encoded by this gene 
may reduce or inhibit inflammation by blocking basophil function in these diseases. In 
addition, these cells are a reasonable model for the inflammatory cells that take part in 
various inflammatory lung and bowel diseases, such as asthma, Crohn's disease, and 
ulcerative colitis. Therefore, therapeutics that modulate the function of this gene product may
reduce or eliminate the symptoms of patients suffering from asthma, Crohn's disease, and
ulcerative colitis.

R. NOV28 - CG57213-01: PB39 

Expression of the NOV28 gene was assessed using the primer-probe set Ag4870, 
described in Table RA. Results of the RTQ-PCR runs are shown in Table RB. 



Table RA. Probe Name Ag4870

Primers | Sequences | Length | Start Position | SEQ ID NO:
Forward | 5'-gcctgccttatctttctgaact-3' | 22 | 599 | 524
Probe | TET-5'-ctttcctgcccctgaggaagtcaatt-3'-TAMRA | 26 | 646 | 525
Reverse | 5'-cactcagcttgatcttcttcgt-3' | 22 | 674 | 526
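
Primer sequences such as those listed for Ag4870 are often screened for base composition before use. The sketch below computes GC content and a rough melting-temperature estimate with the Wallace rule, Tm = 2(A+T) + 4(G+C). This rule is a crude approximation intended for short oligonucleotides and is not a method described in this document; the function names are hypothetical.

```python
def gc_fraction(sequence: str) -> float:
    """Fraction of G and C bases in the sequence (case-insensitive)."""
    seq = sequence.lower()
    return (seq.count("g") + seq.count("c")) / len(seq)


def wallace_tm(sequence: str) -> int:
    """Rough melting temperature in degrees C: 2*(A+T) + 4*(G+C)."""
    seq = sequence.lower()
    at = seq.count("a") + seq.count("t")
    gc = seq.count("g") + seq.count("c")
    return 2 * at + 4 * gc


# Sequences transcribed from Table RA (probe set Ag4870).
AG4870 = {
    "Forward": "gcctgccttatctttctgaact",
    "Probe": "ctttcctgcccctgaggaagtcaatt",
    "Reverse": "cactcagcttgatcttcttcgt",
}

for name, seq in AG4870.items():
    print(f"{name}: length={len(seq)}, GC={gc_fraction(seq):.1%}, "
          f"Wallace Tm~{wallace_tm(seq)} C")
```

Nearest-neighbor thermodynamic models give better estimates for oligos of this length; the Wallace rule is used here only because it can be checked by hand.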



Table RB. General_screening_panel_v1.5

Tissue Name | Rel. Exp.(%) Ag4870, Run 228903631 | Tissue Name | Rel. Exp.(%) Ag4870, Run 228903631
Adipose | 2.0 | Renal ca. TK-10 | 31.4
Melanoma* Hs688(A).T | 6.0 | Bladder | 26.6
Melanoma* Hs688(B).T | 4.2 | Gastric ca. (liver met.) NCI-N87 | 7.4
Melanoma* M14 | 20.4 | Gastric ca. KATO III | 66.0
Melanoma* LOXIMVI | 0.9 | Colon ca. SW-948 | 10.2
Melanoma* SK-MEL-5 | 7.8 | Colon ca. SW480 | 16.8
Squamous cell carcinoma SCC-4 | 0.5 | Colon ca.* (SW480 met) SW620 | 63.7
Testis Pool | 1.3 | Colon ca. HT29 | 17.1
Prostate ca.* (bone met) PC-3 | 24.7 | Colon ca. HCT-116 | 4.4
Prostate Pool | 4.0 | Colon ca. CaCo-2 | 36.6
Placenta | 3.5 | Colon cancer tissue | 4.0
Uterus Pool | 5.0 | Colon ca. SW1116 | 5.7
Ovarian ca. OVCAR-3 | 0.8 | Colon ca. Colo-205 | 6.7
Ovarian ca. SK-OV-3 | 1.2 | Colon ca. SW-48 | 17.1
Ovarian ca. OVCAR-4 | 0.5 | Colon Pool | 6.3
Ovarian ca. OVCAR-5 | 17.1 | Small Intestine Pool | 5.0
Ovarian ca. IGROV-1 | 2.8 | Stomach Pool | 5.4
Ovarian ca. OVCAR-8 | 5.8 | Bone Marrow Pool | 2.3
Ovary | 7.5 | Fetal Heart | 1.0
Breast ca. MCF-7 | 1.9 | Heart Pool | 3.3
Breast ca. MDA-MB-231 | 8.5 | Lymph Node Pool | 6.1
Breast ca. BT 549 | 3.2 | Fetal Skeletal Muscle | 2.5
Breast ca. T47D | 0.4 | Skeletal Muscle Pool | 9.0
Breast ca. MDA-N | 15.6 | Spleen Pool | 1.8
Breast Pool | 8.0 | Thymus Pool | 4.2
Trachea | 7.3 | CNS cancer (glio/astro) U87-MG | 1.6
Lung | 1.5 | CNS cancer (glio/astro) U-118-MG | 0.9
Fetal Lung | 11.6 | CNS cancer (neuro;met) SK-N-AS | 7.0
Lung ca. NCI-N417 | 1.6 | CNS cancer (astro) SF-539 | 3.1
Lung ca. LX-1 | 49.3 | CNS cancer (astro) SNB-75 | 14.9
Lung ca. NCI-H146 | 2.4 | CNS cancer (glio) SNB-19 | 4.2
Lung ca. SHP-77 | 3.5 | CNS cancer (glio) SF-295 | 10.0
Lung ca. A549 | 1.6 | Brain (Amygdala) Pool | 0.5
Lung ca. NCI-H526 | 0.2 | Brain (cerebellum) | 1.1
Lung ca. NCI-H23 | 0.8 | Brain (fetal) | 1.0
Lung ca. NCI-H460 | 1.2 | Brain (Hippocampus) Pool | 0.4
Lung ca. HOP-62 | 5.3 | Cerebral Cortex Pool | 0.4
Lung ca. NCI-H522 | 6.5 | Brain (Substantia nigra) Pool | 0.3
Liver | 21.6 | Brain (Thalamus) Pool | 0.6
Fetal Liver | 100.0 | Brain (whole) | 3.0
Liver ca. HepG2 | 68.3 | Spinal Cord Pool | 0.4
Kidney Pool | 9.3 | Adrenal Gland | 8.1
Fetal Kidney | 1.8 | Pituitary gland Pool | 0.3
Renal ca. 786-0 | 0.5 | Salivary Gland | 9.3
Renal ca. A498 | 0.1 | Thyroid (female) | 1.7
Renal ca. ACHN | 0.1 | Pancreatic ca. CAPAN2 | 1.4
Renal ca. UO-31 | 2.3 | Pancreas Pool | 16.6


General_screening_panel_v1.5 Summary: Ag4870 Highest expression of the NOV28
gene, a PB39 homolog, is seen in the fetal liver. Significant levels of expression are also seen
in cell lines derived from lung, gastric, colon, renal, liver, ovarian, breast, prostate, melanoma
and brain cancers. This expression in proliferative samples suggests a role for the NOV28
gene in cell proliferation and growth. This is consistent with data showing that PB39 is
upregulated in prostate cancer and tissues undergoing growth and differentiation. Thus,
expression of this gene could be used to differentiate between these samples and other
samples on this panel and as a marker to detect the presence of these cancers. Furthermore,
therapeutic modulation of the expression or function of this gene may be effective in the
treatment of these cancers.

References: 

Cole KA, Chuaqui RF, Katz K, Pack S, Zhuang Z, Cole CE, Lyne JC, Linehan WM,
Liotta LA, Emmert-Buck MR. cDNA sequencing and analysis of POV1 (PB39): a novel gene
up-regulated in prostate cancer. Genomics 1998 Jul 15;51(2):282-7

We recently identified a novel gene (PB39) (HGMW-approved symbol POV1) whose
expression is up-regulated in human prostate cancer using tissue microdissection-based 
differential display analysis. In the present study we report the full-length sequencing of 
PB39 cDNA, genomic localization of the PB39 gene, and genomic sequence of the mouse 
homologue. The full-length human cDNA is 2317 nucleotides in length and contains an open 
reading frame of 559 amino acids which does not show homology with any reported human 
genes. The N-terminus contains charged amino acids and a helical loop pattern suggestive of 
an srp leader sequence for a secreted protein. Fluorescence in situ hybridization using PB39 
cDNA as probe mapped the gene to chromosome 11p11.1-p11.2. Comparison of PB39 cDNA
sequence with murine sequence available in the public database identified a region of 
previously sequenced mouse genomic DNA showing 67% amino acid sequence homology 
with human PB39. Based on alignment and comparison to the human cDNA the mouse 
genomic sequence suggests there are at least 14 exons in the mouse gene spread over 
approximately 100 kb of genomic sequence. Further analysis of PB39 expression in human 
tissues shows the presence of a unique splice variant mRNA that appears to be primarily 
associated with fetal tissues and tumors. Interestingly, the unique splice variant appears in 
prostatic intraepithelial neoplasia, a microscopic precursor lesion of prostate cancer. The
current data support the hypothesis that PB39 plays a role in the development of human 
prostate cancer and will be useful in the analysis of the gene product in further human and 
murine studies. 

PMID: 9722952 

Stuart RO, Pavlova A, Beier D, Li Z, Krijanovski Y, Nigam SK. EEG1, a putative
transporter expressed during epithelial organogenesis: comparison with embryonic
transporter expression during nephrogenesis. Am J Physiol Renal Physiol 2001
Dec;281(6):F1148-56

A screen for genes differentially regulated in a model of kidney development
identified the novel gene embryonic epithelia gene 1 (EEG1). EEG1 exists as two transcripts
of 2.4 and 3.5 kb that are most highly expressed at embryonic day 7 and later in the fetal
liver, lung, placenta, and kidney. The EEG1 gene is composed of 14 exons spanning a 20-kb
region at human chromosome 11p12 and the syntenic region of mouse chromosome 2. Six
EEG1 exons have previously been assigned to a longer isoform of eosinophil major basic
protein termed proteoglycan 2. Another gene distantly related to EEG1, POV1/PB39, is
located 88 kb upstream from the EEG1 gene on chromosome 11. Temporal expression of 65
members of the solute carrier (SLC)-class of transport proteins was followed during kidney
development using DNA arrays. POV-1 and EEG1, like glucose transporters, displayed very
early maximal gene expression. In contrast, other SLC genes, such as organic anion and
cation transporters, amino acid permeases, and nucleoside transporters, had maximal
expression later in development. Thus, although the bulk of transporters are expressed late in
kidney development, a fraction are expressed near the onset of nephrogenesis. The data raise
the possibility that EEG1 and POV1 may define a new family of transport proteins involved
in the transport of nutrients or metabolites in rapidly growing and/or developing tissues.

PMID: 11704567 

S. NOV31 - CG57344-01 and CG57344-02: Myelin P2-like

Expression of the NOV31 gene was assessed using the primer-probe set Ag3205,
described in Table SA. Results of the RTQ-PCR runs are shown in Tables SB and SC. 

Table SA. Probe Name Ag3205

Primers | Sequences | Length | Start Position | SEQ ID NO:
Forward | 5'-accagctccaaggaacatg-3' | 19 | 28 | 527
Probe | TET-5'-tccatttcttgtgaaaattccgaaga-3'-TAMRA | 26 | 51 | 528
Reverse | 5'-ttcctatacccagctccttca-3' | 21 | 82 | 529



Table SB. Panel 1.3D

Tissue Name | Rel. Exp.(%) Ag3205, Run 165527062 | Tissue Name | Rel. Exp.(%) Ag3205, Run 165527062
Liver adenocarcinoma | 0.0 | Kidney (fetal) | 0.0
Pancreas | 0.0 | Renal ca. 786-0 | 0.0
Pancreatic ca. CAPAN 2 | 6.0 | Renal ca. A498 | [illegible]
Adrenal gland | 4.4 | Renal ca. RXF 393 | 0.0
Thyroid | 0.0 | Renal ca. ACHN | 0.0
Salivary gland | 0.0 | Renal ca. UO-31 | 0.0
Pituitary gland | 0.0 | Renal ca. TK-10 | 5.4
Brain (fetal) | 10.4 | Liver | 0.0
Brain (whole) | 9.2 | Liver (fetal) | 0.0
Brain (amygdala) | 0.0 | Liver ca. (hepatoblast) HepG2 | 0.0
Brain (cerebellum) | 0.0 | Lung | 5.4
Brain (hippocampus) | 0.0 | Lung (fetal) | 0.0
Brain (substantia nigra) | 0.0 | Lung ca. (small cell) LX-1 | 0.0
Brain (thalamus) | 5.5 | Lung ca. (small cell) NCI-H69 | 0.0
Cerebral Cortex | 3.5 | Lung ca. (s.cell var.) SHP-77 | 5.1
Spinal cord | 0.0 | Lung ca. (large cell) NCI-H460 | 0.0
glio/astro U87-MG | 0.0 | Lung ca. (non-sm. cell) A549 | 0.0
glio/astro U-118-MG | 0.0 | Lung ca. (non-s.cell) NCI-H23 | 0.0
astrocytoma SW1783 | [illegible] | Lung ca. (non-s.cell) HOP-62 | 0.0
neuro*; met SK-N-AS | 0.0 | Lung ca. (non-s.cl) NCI-H522 | 0.0
astrocytoma SF-539 | 0.0 | Lung ca. (squam.) SW 900 | 5.3
astrocytoma SNB-75 | 0.0 | Lung ca. (squam.) NCI-H596 | 4.7
glioma SNB-19 | 0.0 | Mammary gland | 7.3
glioma U251 | 3.6 | Breast ca.* (pl.ef) MCF-7 | 0.0
glioma SF-295 | 0.0 | Breast ca.* (pl.ef) MDA-MB-231 |
Heart (fetal) | 0.0 | Breast ca.* (pl.ef) T47D | 6.8
Heart | 2.9 | Breast ca. BT-549 | 4.9
Skeletal muscle (fetal) | 0.0 | Breast ca. MDA-N | 0.0
Skeletal muscle | 6.6 | Ovary | 0.0
Bone marrow | 0.0 | Ovarian ca. OVCAR-3 | 0.0
Thymus | 4.1 | Ovarian ca. OVCAR-4 | 0.0
Spleen | 0.0 | Ovarian ca. OVCAR-5 | 0.0
Lymph node | 0.0 | Ovarian ca. OVCAR-8 | 0.0
Colorectal | 0.0 | Ovarian ca. IGROV-1 | 0.0
Stomach | 0.0 | Ovarian ca.* (ascites) SK-OV-3 | 0.0
Small intestine | 0.0 | Uterus | 0.0
Colon ca. SW480 | 0.0 | Placenta | 0.0
Colon ca.* SW620 (SW480 met) | 0.0 | Prostate | 0.0
Colon ca. HT29 | 0.0 | Prostate ca.* (bone met) PC-3 | 0.0
Colon ca. HCT-116 | 0.0 | Testis | 100.0
Colon ca. CaCo-2 | 0.0 | Melanoma Hs688(A).T | 3.3
Colon ca. tissue(OD03866) | 0.0 | Melanoma* (met) Hs688(B).T | 0.0
Colon ca. HCC-2998 | 0.0 | Melanoma UACC-62 | 0.0
Gastric ca.* (liver met) NCI-N87 | 14.2 | Melanoma M14 | 0.0
Bladder | 0.0 | Melanoma LOX IMVI | 0.0
Trachea | 0.0 | Melanoma* (met) SK-MEL-5 | 0.0
Kidney | 0.0 | Adipose | 0.0


Table SC. Panel 4D

Tissue Name | Rel. Exp.(%) Ag3205, Run 164531686 | Tissue Name | Rel. Exp.(%) Ag3205, Run 164531686
Secondary Th1 act | 0.0 | HUVEC IL-1beta | 0.0
Secondary Th2 act | 0.0 | HUVEC IFN gamma | 0.0
Secondary Tr1 act | 0.0 | HUVEC TNF alpha + IFN gamma | 0.0
Secondary Th1 rest | 0.0 | HUVEC TNF alpha + IL4 | 1.1
Secondary Th2 rest | 0.0 | HUVEC IL-11 | 0.5
Secondary Tr1 rest | 0.0 | Lung Microvascular EC none | 1.5
Primary Th1 act | 0.0 | Lung Microvascular EC TNFalpha + IL-1beta | 0.0
Primary Th2 act | 0.0 | Microvascular Dermal EC none | 2.5
Primary Tr1 act | 0.0 | Microvascular Dermal EC TNFalpha + IL-1beta | 1.9
Primary Th1 rest | 0.0 | Bronchial epithelium TNFalpha + IL1beta | 1.2
Primary Th2 rest | 0.0 | Small airway epithelium none | 0.6
Primary Tr1 rest | 0.7 | Small airway epithelium TNFalpha + IL-1beta | 1.1
CD45RA CD4 lymphocyte act | 0.0 | Coronary artery SMC rest | 0.0
CD45RO CD4 lymphocyte act | 0.0 | Coronary artery SMC TNFalpha + IL-1beta | 0.0
CD8 lymphocyte act | 0.0 | Astrocytes rest | 0.0
Secondary CD8 lymphocyte rest | 0.0 | Astrocytes TNFalpha + IL-1beta | 0.0
Secondary CD8 lymphocyte act | 0.0 | KU-812 (Basophil) rest | 0.0
CD4 lymphocyte none | 0.0 | KU-812 (Basophil) PMA/ionomycin | 0.0
2ry Th1/Th2/Tr1 anti-CD95 CH11 | 0.0 | CCD1106 (Keratinocytes) none | 0.0
LAK cells rest | 0.0 | CCD1106 (Keratinocytes) TNFalpha + IL-1beta | 0.0
LAK cells IL-2 | 0.0 | Liver cirrhosis | 0.0
LAK cells IL-2+IL-12 | 0.0 | Lupus kidney | 0.0
LAK cells IL-2+IFN gamma | 0.0 | NCI-H292 none | 100.0
LAK cells IL-2+IL-18 | 0.0 | NCI-H292 IL-4 | 1.4
LAK cells PMA/ionomycin | 0.0 | NCI-H292 IL-9 | 3.1
NK Cells IL-2 rest | 0.0 | NCI-H292 IL-13 | 0.0
Two Way MLR 3 day | 0.0 | NCI-H292 IFN gamma | 0.0
Two Way MLR 5 day | 0.0 | HPAEC none | 0.0
Two Way MLR 7 day | 0.0 | HPAEC TNF alpha + IL-1 beta | 0.0
PBMC rest | 0.0 | Lung fibroblast none | 0.0
PBMC PWM | 0.0 | Lung fibroblast TNF alpha + IL-1 beta | 0.0
PBMC PHA-L | 0.0 | Lung fibroblast IL-4 | 0.0
Ramos (B cell) none | 0.0 | Lung fibroblast IL-9 | 0.0
Ramos (B cell) ionomycin | 0.0 | Lung fibroblast IL-13 | 0.0
B lymphocytes PWM | 0.0 | Lung fibroblast IFN gamma | 0.0
B lymphocytes CD40L and IL-4 | 0.0 | Dermal fibroblast CCD1070 rest | 0.0
EOL-1 dbcAMP | 0.0 | Dermal fibroblast CCD1070 TNF alpha | 0.0
EOL-1 dbcAMP PMA/ionomycin | 0.0 | Dermal fibroblast CCD1070 IL-1 beta | 0.0
Dendritic cells none | 0.0 | Dermal fibroblast IFN gamma | 0.0
Dendritic cells LPS | 0.0 | Dermal fibroblast IL-4 | 0.0
Dendritic cells anti-CD40 | 0.0 | IBD Colitis 2 | 0.0
Monocytes rest | 0.0 | IBD Crohn's | 1.2
Monocytes LPS | 0.0 | Colon | 0.0
Macrophages rest | 0.0 | Lung | 0.9
Macrophages LPS | 0.0 | Thymus | 0.0
HUVEC none | 1.4 | Kidney | 1.1
HUVEC starved | 0.0 | |

Panel 1.3D Summary: Ag3205 Expression of the NOV31 gene, which is homologous to
myelin P2, is restricted to a sample derived from the testis (CT=34.7). Thus, expression of
this gene could be used to differentiate between this sample and other samples on this panel
and as a marker of this tissue. Furthermore, the specific pattern of expression suggests that
therapeutic modulation of this protein product may be useful in the treatment of male
infertility or hypogonadism.

References: 

Schmitt MC, Jamison RS, Orgebin-Crist MC, Ong DE. A novel, testis-specific
member of the cellular lipophilic transport protein superfamily, deduced from a 
complimentary deoxyribonucleic acid clone. Biol Reprod 1994 Aug;51(2):239-45 

A novel member of the cellular lipophilic transport protein superfamily was identified 
after an antiserum raised against cellular retinoic acid-binding protein (CRABP) was found 
also to contain antibodies against another 15-kDa protein present in the cytosol of pubertal 
and adult rat testis. These antibodies were used to screen a rat testis cDNA expression library 
and isolate a 561-bp clone containing a full open reading frame from which the sequence of a
novel 132 amino acid protein was deduced. The protein has 58% amino acid sequence 
identity to bovine myelin P2, 58% identity to murine adipocyte lipid-binding protein, and 
40% identity to rat CRABP. Although the endogenous ligand has not yet been identified, 
conservation of residues involved in the binding of carboxylate groups suggests that the 
ligand is a fatty acid or an acidic retinoid. Tissue-specific expression was examined by 
Northern analysis and immunolocalization and appears to be restricted to late germ cells 
within the testis and epididymis. Immunostaining was first detectable in mid-pachytene 
spermatocytes and increased in intensity as these cells progressed to elongated spermatids, 
suggesting that this testis lipid-binding protein has a specific role in sperm development. 

PMID: 7948479 

Panel 4D Summary: Ag3205 Expression of the CG57344-01 gene is restricted to a sample 
derived from untreated NCI-H292 cells (CT=31.9). Thus, expression of this gene could be 
used as a marker of this cell type. 

T. NOV32 - CG57346-01 and CG57346-02: TESTIS LIPID BINDING PROTEIN 

Expression of the NOV32 gene was assessed using the primer-probe set Ag3206, 
described in Table TA. Results of the RTQ-PCR runs are shown in Tables TB and TC. 



Table TA. Probe Name Ag3206 

Primers | Sequences | Length | Start Position | SEQ ID NO: 
Forward | 5'-agtgttgatgggaaaatgatga-3' | 22 | 160 | 530 
Probe | TET-5'-ccataagaacagaaagttctttccaggaca-3'-TAMRA | 30 | 182 | 531 
Reverse | 5'-ccccagcttgaaggagatc-3' | 19 | 216 | 532 



Table TB. Panel 1.3D 

Tissue Name | Rel. Exp.(%) Ag3206, Run 165527079 | Tissue Name | Rel. Exp.(%) Ag3206, Run 165527079 
Liver adenocarcinoma | 0.0 | Kidney (fetal) | 10.5 
Pancreas | 0.0 | Renal ca. 786-0 | 0.0 
Pancreatic ca. CAPAN 2 | 17.8 | Renal ca. A498 | 15.6 
Adrenal gland | 0.0 | Renal ca. RXF 393 | 0.0 
Thyroid | 0.0 | Renal ca. ACHN | [illegible] 
Salivary gland | 0.0 | Renal ca. UO-31 | 0.0 
Pituitary gland | 0.0 | Renal ca. TK-10 | 0.0 
Brain (fetal) | 0.0 | Liver | 0.0 
Brain (whole) | 9.9 | Liver (fetal) | 0.0 
Brain (amygdala) | 0.0 | Liver ca. (hepatoblast) HepG2 | 0.0 
Brain (cerebellum) | 0.0 | Lung | 0.0 
Brain (hippocampus) | 0.0 | Lung (fetal) | 0.0 
Brain (substantia nigra) | 0.0 | Lung ca. (small cell) LX-1 | 4.5 
Brain (thalamus) | 0.0 | Lung ca. (small cell) NCI-H69 | 0.0 
Cerebral Cortex | 0.0 | Lung ca. (s.cell var.) SHP-77 | 18.0 
Spinal cord | 33.4 | Lung ca. (large cell) NCI-H460 | 41.2 
glio/astro U87-MG | 0.0 | Lung ca. (non-sm. cell) A549 | 0.0 
glio/astro U-118-MG | 0.0 | Lung ca. (non-s.cell) NCI-H23 | 0.0 
astrocytoma SW1783 | 0.0 | Lung ca. (non-s.cell) HOP-62 | 0.0 
neuro*; met SK-N-AS | 0.0 | Lung ca. (non-s.cl) NCI-H522 | 0.0 
astrocytoma SF-539 | 0.0 | Lung ca. (squam.) SW 900 | 0.0 
astrocytoma SNB-75 | 11.7 | Lung ca. (squam.) NCI-H596 | 0.0 
glioma SNB-19 | 0.0 | Mammary gland | 14.4 
glioma U251 | 0.0 | Breast ca.* (pl.ef) MCF-7 | 0.0 
glioma SF-295 | 0.0 | Breast ca.* (pl.ef) MDA-MB-231 | 0.0 
Heart (fetal) | 0.0 | Breast ca.* (pl.ef) T47D | 0.0 
Heart | 15.5 | Breast ca. BT-549 | 0.0 
Skeletal muscle (fetal) | 0.0 | Breast ca. MDA-N | 0.0 
Skeletal muscle | 0.0 | Ovary | [illegible] 
Bone marrow | [illegible] | Ovarian ca. OVCAR-3 | [illegible] 
Thymus | [illegible] | Ovarian ca. OVCAR-4 | [illegible] 
Spleen | 0.0 | Ovarian ca. OVCAR-5 | 0.0 
Lymph node | 0.0 | Ovarian ca. OVCAR-8 | 0.0 
Colorectal | 0.0 | Ovarian ca. IGROV-1 | 11.6 
Stomach | 0.0 | Ovarian ca.* (ascites) SK-OV-3 | 0.0 
Small intestine | 0.0 | Uterus | 0.0 
Colon ca. SW480 | 0.0 | Placenta | 0.0 
Colon ca.* SW620 (SW480 met) | 0.0 | Prostate | 0.0 
Colon ca. HT29 | 0.0 | Prostate ca.* (bone met) PC-3 | 100.0 
Colon ca. HCT-116 | 0.0 | Testis | 27.5 
Colon ca. CaCo-2 | 42.0 | Melanoma Hs688(A).T | 0.0 
Colon ca. tissue (OD03866) | 0.0 | Melanoma* (met) Hs688(B).T | 0.0 
Colon ca. HCC-2998 | 0.0 | Melanoma UACC-62 | 0.0 
Gastric ca.* (liver met) NCI-N87 | [illegible] | Melanoma M14 | 0.0 
Bladder | 0.0 | Melanoma LOX IMVI | 0.0 
Trachea | 0.0 | Melanoma* (met) SK-MEL-5 | 0.0 
Kidney | 0.0 | Adipose | 0.0 



Table TC. Panel 4D 

Tissue Name | Rel. Exp.(%) Ag3206, Run 164531735 | Tissue Name | Rel. Exp.(%) Ag3206, Run 164531735 
Secondary Th1 act | 0.0 | HUVEC IL-1beta | 0.0 
Secondary Th2 act | 11.9 | HUVEC IFN gamma | 0.0 
Secondary Tr1 act | 0.0 | HUVEC TNF alpha + IFN gamma | 12.6 
Secondary Th1 rest | 11.9 | HUVEC TNF alpha + IL4 | 15.9 
Secondary Th2 rest | 0.0 | HUVEC IL-11 | 0.0 
Secondary Tr1 rest | 0.0 | Lung Microvascular EC none | 75.8 
Primary Th1 act | 0.0 | Lung Microvascular EC TNFalpha + IL-1beta | 100.0 
Primary Th2 act | [illegible] | Microvascular Dermal EC none | 72.2 
Primary Tr1 act | [illegible] | Microvascular Dermal EC TNFalpha + IL-1beta | [illegible] 
Primary Th1 rest | 0.0 | Bronchial epithelium TNFalpha + IL1beta | [illegible] 
Primary Th2 rest | 0.0 | Small airway epithelium none | 0.0 
Primary Tr1 rest | 0.0 | Small airway epithelium TNFalpha + IL-1beta | 0.0 
CD45RA CD4 lymphocyte act | 0.0 | Coronary artery SMC rest | 0.0 
CD45RO CD4 lymphocyte act | 0.0 | Coronary artery SMC TNFalpha + IL-1beta | 0.0 
CD8 lymphocyte act | 0.0 | Astrocytes rest | 0.0 
Secondary CD8 lymphocyte rest | 0.0 | Astrocytes TNFalpha + IL-1beta | 0.0 
Secondary CD8 lymphocyte act | 0.0 | KU-812 (Basophil) rest | [illegible] 
CD4 lymphocyte none | 0.0 | KU-812 (Basophil) PMA/ionomycin | 0.0 
2ry Th1/Th2/Tr1_anti-CD95 CH11 | 0.0 | CCD1106 (Keratinocytes) none | 0.0 
LAK cells rest | 0.0 | CCD1106 (Keratinocytes) TNFalpha + IL-1beta | 0.0 
LAK cells IL-2 | 0.0 | Liver cirrhosis | 29.7 
LAK cells IL-2+IL-12 | 0.0 | Lupus kidney | 0.0 
LAK cells IL-2+IFN gamma | [illegible] | NCI-H292 none | 97.3 
LAK cells IL-2+IL-18 | 0.0 | NCI-H292 IL-4 | 0.0 
LAK cells PMA/ionomycin | [illegible] | NCI-H292 IL-9 | 43.8 
NK Cells IL-2 rest | 7.8 | NCI-H292 IL-13 | 24.0 
Two Way MLR 3 day | 0.0 | NCI-H292 IFN gamma | 12.7 
Two Way MLR 5 day | 0.0 | HPAEC none | 14.3 
Two Way MLR 7 day | 0.0 | HPAEC TNF alpha + IL-1 beta | 33.2 
PBMC rest | 0.0 | Lung fibroblast none | 0.0 
PBMC PWM | 0.0 | Lung fibroblast TNF alpha + IL-1 beta | 0.0 
PBMC PHA-L | 0.0 | Lung fibroblast IL-4 | 0.0 
Ramos (B cell) none | 0.0 | Lung fibroblast IL-9 | 16.2 
Ramos (B cell) ionomycin | 0.0 | Lung fibroblast IL-13 | 15.9 
B lymphocytes PWM | 0.0 | Lung fibroblast IFN gamma | 0.0 
B lymphocytes CD40L and IL-4 | 0.0 | Dermal fibroblast CCD1070 rest | 0.0 
EOL-1 dbcAMP | 0.0 | Dermal fibroblast CCD1070 TNF alpha | 0.0 
EOL-1 dbcAMP PMA/ionomycin | 0.0 | Dermal fibroblast CCD1070 IL-1 beta | 0.0 
Dendritic cells none | 0.0 | Dermal fibroblast IFN gamma | 0.0 
Dendritic cells LPS | 0.0 | Dermal fibroblast IL-4 | 15.0 
Dendritic cells anti-CD40 | 0.0 | IBD Colitis 2 | [illegible] 
Monocytes rest | 0.0 | IBD Crohn's | 0.0 
Monocytes LPS | 0.0 | Colon | 6.7 
Macrophages rest | 0.0 | Lung | 0.0 
Macrophages LPS | 0.0 | Thymus | 0.0 
HUVEC none | 27.7 | Kidney | 0.0 
HUVEC starved | 20.0 | | 






Panel 1.3D Summary: Ag3206 Expression of the NOV32 gene is restricted to a sample 
derived from a prostate cancer cell line (CT=34.9). Thus, expression of this gene could be 
used to differentiate between this sample and other samples on this panel and as a marker to 
detect the presence of prostate cancer. Furthermore, therapeutic modulation of the expression 
or function of this gene may be effective in the treatment of prostate cancer. 



Panel 4D Summary: Ag3206 Expression of the NOV32 gene is primarily restricted to a 
cluster of samples derived from microvasculature of the lung and the dermis suggesting a role 
for this gene in the maintenance of the integrity of the microvasculature. Therefore, 
therapeutics designed for this putative protein could be beneficial for the treatment of 
diseases associated with damaged microvasculature including heart diseases or inflammatory 
diseases, such as psoriasis, asthma, and chronic obstructive pulmonary diseases. 

U. NOV33 - CG57356-01: novel intracellular thrombospondin domain 
containing protein 

Expression of the NOV33 gene was assessed using the primer-probe set Ag672, 
described in Table UA. Results of the RTQ-PCR runs are shown in Table UB. 

Table UA. Probe Name Ag672 

Primers | Sequences | Length | Start Position | SEQ ID NO: 
Forward | 5'-ccagatcctttctccttgatct-3' | 22 | 1076 | 533 
Probe | TET-5'-ccaaactttccagatctttccaaagctg-3'-TAMRA | 28 | 1047 | 534 
Reverse | 5'-tgacctggatatttggattctg-3' | 22 | 1014 | 535 

Table UB. Panel 1.1 

Tissue Name | Rel. Exp.(%) Ag672, Run 109210121 | Tissue Name | Rel. Exp.(%) Ag672, Run 109210121 
Adrenal gland | 3.8 | Renal ca. UO-31 | 0.1 
Bladder | 24.3 | Renal ca. RXF 393 | 11.6 
Brain (amygdala) | 0.2 | Liver | 5.0 
Brain (cerebellum) | 3.7 | Liver (fetal) | 0.0 
Brain (hippocampus) | 3.1 | HepG2 | 0.6 
Brain (substantia nigra) | [illegible] | Lung | [illegible] 
Brain (thalamus) | 10.4 | Lung (fetal) | 57.4 
Cerebral Cortex | 3.2 | Lung ca. (non-s.cell) HOP-62 | 0.6 
Brain (fetal) | 13 | Lung ca. (large cell) NCI-H460 | 4.9 
Brain (whole) | 1.5 | Lung ca. (non-s.cell) NCI-H23 | 0.1 
glio/astro U-118-MG | 15.3 | Lung ca. (non-s.cl) NCI-H522 | 0.0 
astrocytoma SF-539 | 0.0 | Lung ca. (non-sm. cell) A549 | 23 
astrocytoma SNB-75 | 0.6 | Lung ca. (s.cell var.) SHP-77 | 0.1 
astrocytoma SW1783 | 0.0 | Lung ca. (small cell) LX-1 | 11.6 
glioma U251 | 1.0 | Lung ca. (small cell) NCI-H69 | 17.7 
glioma SF-295 | [illegible] | Lung ca. (squam.) SW 900 | [illegible] 
glioma SNB-19 | [illegible] | Lung ca. (squam.) NCI-H596 | [illegible] 
glio/astro U87-MG | 3.7 | Lymph node | 3.4 
neuro*; met SK-N-AS | 0.0 | Spleen | 3.8 
Mammary gland | 43.5 | Thymus | 2.5 
Breast ca. BT-549 | 0.8 | Ovary | 0.8 
Breast ca. MDA-N | 0.1 | Ovarian ca. IGROV-1 | 5.0 
Breast ca.* (pl.ef) T47D | 9.6 | Ovarian ca. OVCAR-3 | 6.3 
Breast ca.* (pl.ef) MCF-7 | 2.7 | Ovarian ca. OVCAR-4 | 0.0 
Breast ca.* (pl.ef) MDA-MB-231 | 1.2 | Ovarian ca. OVCAR-5 | 6.3 
Small intestine | 62 | Ovarian ca. OVCAR-8 | 1.8 
Colorectal | [illegible] | Ovarian ca.* (ascites) SK-OV-3 | 2.7 
Colon ca. HT29 | 0.0 | Pancreas | 54.0 
Colon ca. CaCo-2 | 2.1 | Pancreatic ca. CAPAN 2 | 0.1 
Colon ca. [illegible] | 1.7 | Pituitary gland | [illegible] 
Colon ca. HCT-116 | [illegible] | Placenta | 100.0 
Colon ca. HCC-2998 | 2.2 | Prostate | 1.0 
Colon ca. SW480 | 19.1 | Prostate ca.* (bone met) PC-3 | 0.0 
Colon ca. SW620 (SW480 met) | 0.8 | Salivary gland | 42.3 
Stomach | 6.7 | Trachea | 23.8 
Gastric ca. (liver met) NCI-N87 | 9.5 | Spinal cord | 6.7 
Heart | 23.0 | Testis | 0.0 
Skeletal muscle (Fetal) | 1.6 | Thyroid | 78.5 
Skeletal muscle | 13.1 | Uterus | 1.6 
Endothelial cells | 0.2 | Melanoma M14 | 0.0 
Heart (Fetal) | 4.2 | Melanoma LOX IMVI | 0.0 
Kidney | 6.4 | Melanoma UACC-62 | 0.0 
Kidney (fetal) | 5.0 | Melanoma SK-MEL-28 | 65.5 
Renal ca. 786-0 | 2.0 | Melanoma* (met) SK-MEL-5 | 37.1 
Renal ca. A498 | 13.2 | Melanoma Hs688(A).T | 0.0 
Renal ca. ACHN | 0.1 | Melanoma* (met) Hs688(B).T | 0.0 
Renal ca. TK-10 | [illegible] | | 







Panel 1.1 Summary: Ag672 The results obtained in this experiment are comparable to what 
is observed in Panel 1. Expression of the NOV33 gene is primarily associated with normal 
tissues on this panel. Highest expression is seen in placenta (CT = 25), thyroid (CT = 25.2), 
pancreas (CT = 25.7), and mammary gland (CT = 26). Therefore, the NOV33 gene might be 
useful as a marker to distinguish these tissues. In addition, the observed expression in 
mammary gland and placenta suggests a potential role for the NOV33 gene product in 
pregnancy. Interestingly, expression of this gene is much lower in 5/5 breast cancer cell lines 
when compared to normal breast. This suggests that replacement of the NOV33 gene product 
using protein therapeutics, peptides or gene therapy would be valuable in the treatment of 
breast cancer. 

In addition, the NOV33 gene is expressed throughout the CNS with low to moderate 
expression detected in amygdala, cerebellum, hippocampus, substantia nigra, thalamus and 
cerebral cortex. Expression of this gene is decreased in CNS cancer cell lines relative to 
normal brain tissues. The secreted protein encoded by the NOV33 gene contains 
homology to thrombospondin, and it may play a role in inhibiting angiogenesis. 
Therefore, treatment with the NOV33 protein, or in vivo modulation of the gene or the 
protein product, may be of use in slowing the growth of, or inhibiting, CNS tumors. 
Selective removal of this protein via synthetic antibodies may help to increase vascularization 
in CNS tissue undergoing repair/regeneration. 

Among the metabolically relevant tissues, the NOV33 gene is expressed at high levels 
in thyroid and pancreas and at more moderate levels in adrenal gland, pituitary gland, heart, 
and skeletal muscle. Therefore, this gene product may have utility as a drug treatment for any 
or all diseases of the thyroid gland and pancreas as well as other metabolic and 
neuroendocrine diseases. Interestingly, this gene is more highly expressed in adult liver (CT 
= 29) than in fetal liver (CT = 40), suggesting that the NOV33 gene would be a useful marker 
for differentiating between the adult and fetal liver. 
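
The relationship between the CT values quoted in the summaries and the relative expression percentages in the tables can be illustrated with a minimal sketch. It assumes, as an illustration only, that each sample is scaled to the sample with the lowest CT on the panel and that the amount of product doubles with every cycle; the normalization actually used for these panels is not stated in this section, and the function name and the `efficiency` parameter are hypothetical. Because the CT values in the summary are rounded, the output is expected to approximate, not reproduce, the values in Table UB.

```python
# Sketch only: approximate relative expression (%) from threshold-cycle (CT) values.
# Assumptions (not taken from the specification): perfect doubling per cycle and
# scaling to the sample with the lowest CT on the panel.

def relative_expression(ct: float, ct_min: float, efficiency: float = 2.0) -> float:
    """Return expression as a percentage of the highest-expressing sample."""
    return 100.0 * efficiency ** (ct_min - ct)

# CT values quoted in the Panel 1.1 summary for the NOV33 gene.
ct_values = {
    "placenta": 25.0,
    "thyroid": 25.2,
    "pancreas": 25.7,
    "mammary gland": 26.0,
    "adult liver": 29.0,
    "fetal liver": 40.0,
}

ct_min = min(ct_values.values())
rel = {tissue: relative_expression(ct, ct_min) for tissue, ct in ct_values.items()}

for tissue, pct in rel.items():
    print(f"{tissue:15s} CT={ct_values[tissue]:5.1f}  rel. exp. = {pct:6.2f}%")
```

Under these assumptions a difference of one cycle corresponds to a two-fold difference in expression, so a CT of 40 against a panel minimum of 25 rounds to 0.0%, in line with the fetal liver entry in Table UB.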

V. NOV34a - CG57258-01: ornithine decarboxylase 

Expression of the NOV34a gene was assessed using the primer-probe set Ag3148, 
described in Table VA. Results of the RTQ-PCR runs are shown in Tables VB, VC, VD and 
VE. 

Table VA. Probe Name Ag3148 

Primers | Sequences | Length | Start Position | SEQ ID NO: 
Forward | 5'-acctgctgaaggaactcactct-3' | 22 | 112 | 536 
Probe | TET-5'-ctcacaggacg^gtagctgccttct-3'-TAMRA | 26 | 140 | 537 
Reverse | 5'-gcaaaagtgcttcctcactatg-3' | 22 | 185 | 538 



Table VB. CNS_neurodegeneration_v1.0 

Tissue Name | Rel. Exp.(%) Ag3148, Run 209057912 | Rel. Exp.(%) Ag3148, Run 249265913 | Tissue Name | Rel. Exp.(%) Ag3148, Run 209057912 | Rel. Exp.(%) Ag3148, Run 249265913 
AD 1 Hippo | 14.6 | 13.9 | Control (Path) 3 Temporal Ctx | 4.9 | 3.7 
AD 2 Hippo | 37.6 | 27.7 | Control (Path) 4 Temporal Ctx | 22.2 | 26.1 
AD 3 Hippo | 8.2 | 7.3 | AD 1 Occipital Ctx | 8.8 | 8.6 
AD 4 Hippo | 12.8 | 12.9 | AD 2 Occipital Ctx (Missing) | 0.0 | 0.0 
AD 5 Hippo | 67.4 | 49.7 | AD 3 Occipital Ctx | 5.6 | 1.7 
AD 6 Hippo | 29.1 | 29.7 | AD 4 Occipital Ctx | 21.8 | 21.2 
Control 2 Hippo | 51.8 | 50.0 | AD 5 Occipital Ctx | 71.2 | 92 
Control 4 Hippo | 14.9 | 17.0 | AD 6 Occipital Ctx | 9.5 | 59.0 
Control (Path) 3 Hippo | 4.1 | 3.7 | Control 1 Occipital Ctx | 6.9 | 32 
AD 1 Temporal Ctx | [illegible] | [illegible] | Control 2 Occipital Ctx | 77.0 | 77.4 
AD 2 Temporal Ctx | 35.6 | 38.2 | Control 3 Occipital Ctx | 17.1 | 14.2 
AD 3 Temporal Ctx | 2.0 | 5.5 | Control 4 Occipital Ctx | 9.2 | 8.8 
AD 4 Temporal Ctx | 20.3 | 14.9 | Control (Path) 1 Occipital Ctx | 100.0 | 78.5 
AD 5 Inf Temporal Ctx | 79.6 | 70.2 | Control (Path) 2 Occipital Ctx | 9.3 | 8.1 
AD 5 Sup Temporal Ctx | 35.8 | 30.8 | Control (Path) 3 Occipital Ctx | 2.3 | 1.1 
AD 6 Inf Temporal Ctx | 19.2 | 16.5 | Control (Path) 4 Occipital Ctx | 10.1 | 12.8 
AD 6 Sup Temporal Ctx | 15.7 | 28.5 | Control 1 Parietal Ctx | 11.3 | 12.0 
Control 1 Temporal Ctx | 6.7 | 7.7 | Control 2 Parietal Ctx | 32.8 | 37.1 
Control 2 Temporal Ctx | [illegible] | [illegible] | Control 3 Parietal Ctx | [illegible] | [illegible] 
Control 3 Temporal Ctx | 16.3 | 17.9 | Control (Path) 1 Parietal Ctx | 87.1 | 100.0 
Control 3 Temporal Ctx | 10.4 | 9.5 | Control (Path) 2 Parietal Ctx | 18.3 | 18.7 
Control (Path) 1 Temporal Ctx | 72.7 | 69.7 | Control (Path) 3 Parietal Ctx | 4.4 | 2.0 
Control (Path) 2 Temporal Ctx | 31.0 | 31.0 | Control (Path) 4 Parietal Ctx | 26.6 | 3.4 



Table VC. Panel 1.3D 

Tissue Name | Rel. Exp.(%) Ag3148, Run 167994492 | Tissue Name | Rel. Exp.(%) Ag3148, Run 167994492 
Liver adenocarcinoma | 27.4 | Kidney (fetal) | 15.4 
Pancreas | 6.5 | Renal ca. 786-0 | 5.9 
Pancreatic ca. CAPAN 2 | 2.9 | Renal ca. A498 | 3.5 
Adrenal gland | 2.9 | Renal ca. RXF 393 | 13.6 
Thyroid | 8.2 | Renal ca. ACHN | 10.9 
Salivary gland | 2.1 | Renal ca. UO-31 | 0.0 
Pituitary gland | 7.6 | Renal ca. TK-10 | 3.7 
Brain (fetal) | 49.3 | Liver | 0.0 
Brain (whole) | 51.8 | Liver (fetal) | 0.0 
Brain (amygdala) | 32.8 | Liver ca. (hepatoblast) HepG2 | 3.2 
Brain (cerebellum) | [illegible] | Lung | [illegible] 
Brain (hippocampus) | 40.9 | Lung (fetal) | 13.8 
Brain (substantia nigra) | 68.3 | Lung ca. (small cell) LX-1 | 2.3 
Brain (thalamus) | 49.3 | Lung ca. (small cell) NCI-H69 | [illegible] 
Cerebral Cortex | 100.0 | Lung ca. (s.cell var.) SHP-77 | 5.6 
Spinal cord | 41.8 | Lung ca. (large cell) NCI-H460 | 1.3 
glio/astro U87-MG | 93 | Lung ca. (non-sm. cell) A549 | 7.4 
glio/astro U-118-MG | 0.0 | Lung ca. (non-s.cell) NCI-H23 | 1.2 
astrocytoma SW1783 | 2.6 | Lung ca. (non-s.cell) HOP-62 | 21.9 
neuro*; met SK-N-AS | 1.4 | Lung ca. (non-s.cl) NCI-H522 | 7.2 
astrocytoma SF-539 | 2.8 | Lung ca. (squam.) SW 900 | 3.8 
astrocytoma SNB-75 | 6.6 | Lung ca. (squam.) NCI-H596 | 0.0 
glioma SNB-19 | 15.8 | Mammary gland | 13.8 
glioma U251 | 57.8 | Breast ca.* (pl.ef) MCF-7 | 0.8 
glioma SF-295 | 22.4 | Breast ca.* (pl.ef) MDA-MB-231 | 12.7 
Heart (fetal) | 36.9 | Breast ca.* (pl.ef) T47D | 13.8 
Heart | 4.1 | Breast ca. BT-549 | 1.4 
Skeletal muscle (fetal) | 13.4 | Breast ca. MDA-N | 1.8 
Skeletal muscle | 6.0 | Ovary | 22.2 
Bone marrow | 0.0 | Ovarian ca. OVCAR-3 | 0.6 
Thymus | 3.8 | Ovarian ca. OVCAR-4 | 10.0 
Spleen | 0.5 | Ovarian ca. OVCAR-5 | 19.5 
Lymph node | 3.7 | Ovarian ca. OVCAR-8 | 3.6 
Colorectal | 3.5 | Ovarian ca. IGROV-1 | 3.2 
Stomach | 2.0 | Ovarian ca.* (ascites) SK-OV-3 | 14.3 
Small intestine | 7.5 | Uterus | 7.5 
Colon ca. SW480 | 33 | Placenta | 0.0 
Colon ca.* SW620 (SW480 met) | 6.8 | Prostate | 4.4 
Colon ca. HT29 | 0.0 | Prostate ca.* (bone met) PC-3 | 3.3 
Colon ca. HCT-116 | 33 | Testis | 61.6 
Colon ca. CaCo-2 | 1.7 | Melanoma Hs688(A).T | 1.1 
Colon ca. tissue (OD03866) | 3.2 | Melanoma* (met) Hs688(B).T | 2.0 
Colon ca. HCC-2998 | 1.2 | Melanoma UACC-62 | 14.9 
Gastric ca.* (liver met) NCI-N87 | 9.6 | Melanoma M14 | 2.6 
Bladder | 1.6 | Melanoma LOX IMVI | 4.7 
Trachea | 2.0 | Melanoma* (met) SK-MEL-5 | 7.3 
Kidney | 13.2 | Adipose | 2.9 



Table VD. Panel 4D 

Tissue Name | Rel. Exp.(%) Ag3148, Run 164528041 | Tissue Name | Rel. Exp.(%) Ag3148, Run 164528041 
Secondary Th1 act | 25.9 | HUVEC IL-1beta | 0.0 
Secondary Th2 act | 18.6 | HUVEC IFN gamma | 2.6 
Secondary Tr1 act | 8.4 | HUVEC TNF alpha + IFN gamma | 8.3 
Secondary Th1 rest | 14.1 | HUVEC TNF alpha + IL4 | 6.8 
Secondary Th2 rest | 12.1 | HUVEC IL-11 | 3.9 
Secondary Tr1 rest | 24.7 | Lung Microvascular EC none | 10.4 
Primary Th1 act | 16.5 | Lung Microvascular EC TNFalpha + IL-1beta | 12.8 
Primary Th2 act | 16.0 | Microvascular Dermal EC none | 7.5 
Primary Tr1 act | 4.1 | Microvascular Dermal EC TNFalpha + IL-1beta | 13.9 
Primary Th1 rest | 41.2 | Bronchial epithelium TNFalpha + IL1beta | 57.8 
Primary Th2 rest | 53.2 | Small airway epithelium none | 7.9 
Primary Tr1 rest | 88.3 | Small airway epithelium TNFalpha + IL-1beta | 45.1 
CD45RA CD4 lymphocyte act | 21.6 | Coronary artery SMC rest | 25.2 
CD45RO CD4 lymphocyte act | 10.6 | Coronary artery SMC TNFalpha + IL-1beta | 18.2 
CD8 lymphocyte act | 2.8 | Astrocytes rest | 8.0 
Secondary CD8 lymphocyte rest | 7.4 | Astrocytes TNFalpha + IL-1beta | 12.2 
Secondary CD8 lymphocyte act | 12.4 | KU-812 (Basophil) rest | 4.4 
CD4 lymphocyte none | [illegible] | KU-812 (Basophil) PMA/ionomycin | 0.0 
2ry Th1/Th2/Tr1_anti-CD95 CH11 | 10.1 | CCD1106 (Keratinocytes) none | 5.7 
LAK cells rest | 33.9 | CCD1106 (Keratinocytes) TNFalpha + IL-1beta | 31.9 
LAK cells IL-2 | [illegible] | Liver cirrhosis | [illegible] 
LAK cells IL-2+IL-12 | [illegible] | Lupus kidney | 11.7 
LAK cells IL-2+IFN gamma | 30.8 | NCI-H292 none | 87.1 
LAK cells IL-2+IL-18 | [illegible] | NCI-H292 IL-4 | [illegible] 
LAK cells PMA/ionomycin | 5.8 | NCI-H292 IL-9 | 48.3 
NK Cells rest | [illegible] | NCI-H292 IL-13 | [illegible] 
Two Way MLR 3 day | [illegible] | NCI-H292 IFN gamma | [illegible] 
Two Way MLR 5 day | [illegible] | HPAEC none | [illegible] 
Two Way MLR 7 day | 8.6 | HPAEC TNF alpha + IL-1 beta | 5.7 
PBMC rest | 15.5 | Lung fibroblast none | [illegible] 
PBMC PWM | 25.0 | Lung fibroblast TNF alpha + IL-1 beta | 41.5 
PBMC PHA-L | [illegible] | Lung fibroblast IL-4 | 7.2 
Ramos (B cell) none | 0.0 | Lung fibroblast IL-9 | 2.9 
Ramos (B cell) ionomycin | 0.0 | Lung fibroblast IL-13 | 4.3 
B lymphocytes PWM | 18.7 | Lung fibroblast IFN gamma | 15.2 
B lymphocytes CD40L and IL-4 | 19.9 | Dermal fibroblast CCD1070 rest | 14.9 
EOL-1 dbcAMP | 32.5 | Dermal fibroblast CCD1070 TNF alpha | 82.9 
EOL-1 dbcAMP PMA/ionomycin | 16.5 | Dermal fibroblast CCD1070 IL-1 beta | 10.1 
Dendritic cells none | 2.5 | Dermal fibroblast IFN gamma | 21.8 
Dendritic cells LPS | 44.4 | Dermal fibroblast IL-4 | 10.7 
Dendritic cells anti-CD40 | 0.0 | IBD Colitis 2 | 2.3 
Monocytes rest | 0.0 | IBD Crohn's | 7.3 
Monocytes LPS | 5.9 | Colon | 4.2 
Macrophages rest | 3.3 | Lung | 13.9 
Macrophages LPS | 42.3 | Thymus | 55.5 
HUVEC none | 0.0 | Kidney | 53.6 
HUVEC starved | 4.2 | | 







Table VE. Panel CNS_1 

Tissue Name | Rel. Exp.(%) Ag3148, Run 171694168 | Tissue Name | Rel. Exp.(%) Ag3148, Run 171694168 
BA4 Control | 39.5 | BA17 PSP | 7.7 
BA4 Control2 | 55.1 | BA17 PSP2 | 5.3 
BA4 Alzheimer's2 | 7.9 | Sub Nigra Control | 29.5 
BA4 Parkinson's | 36.9 | Sub Nigra Control2 | 35.4 
BA4 Parkinson's2 | 55.5 | Sub Nigra Alzheimer's2 | 24.1 
BA4 Huntington's | 43.2 | Sub Nigra Parkinson's2 | 70.7 
BA4 Huntington's2 | 4.8 | Sub Nigra Huntington's | 75.8 
BA4 PSP | 9.3 | Sub Nigra Huntington's2 | 33.9 
BA4 PSP2 | 19.9 | Sub Nigra PSP2 | 9.2 
BA4 Depression | 19.2 | Sub Nigra Depression | 7.7 
BA4 Depression2 | 7.5 | Sub Nigra Depression2 | 1.2 
BA7 Control | 31.0 | Glob Palladus Control | 19.2 
BA7 Control2 | 47.0 | Glob Palladus Control2 | 13.7 
BA7 Alzheimer's2 | 3.2 | Glob Palladus Alzheimer's | 37.6 
BA7 Parkinson's | 15.8 | Glob Palladus Alzheimer's2 | 6.6 
BA7 Parkinson's2 | 41.2 | Glob Palladus Parkinson's | 45.4 
BA7 Huntington's | 36.3 | Glob Palladus Parkinson's2 | 17.9 
BA7 Huntington's2 | 13.5 | Glob Palladus PSP | 5.7 
BA7 PSP | 18.6 | Glob Palladus PSP2 | 15.3 
BA7 PSP2 | 29.7 | Glob Palladus Depression | 6.5 
BA7 Depression | 5.3 | Temp Pole Control | 4.6 
BA9 Control | 27.4 | Temp Pole Control2 | 42.6 
BA9 Control2 | 87.1 | Temp Pole Alzheimer's | 2.5 
BA9 Alzheimer's | 3.6 | Temp Pole Alzheimer's2 | 0.0 
BA9 Alzheimer's2 | 18.9 | Temp Pole Parkinson's | 26.1 
BA9 Parkinson's | 22.1 | Temp Pole Parkinson's2 | 21.8 
BA9 Parkinson's2 | 54.0 | Temp Pole Huntington's | 40.1 
BA9 Huntington's | 54.3 | Temp Pole PSP | 2.5 
BA9 Huntington's2 | 25.7 | Temp Pole PSP2 | 1.4 
BA9 PSP | 10.4 | Temp Pole Depression2 | 4.3 
BA9 PSP2 | [illegible] | Cing Gyr Control | [illegible] 
BA9 Depression | [illegible] | Cing Gyr Control2 | [illegible] 
BA9 Depression2 | 5.7 | Cing Gyr Alzheimer's | 33.7 
BA17 Control | 40.1 | Cing Gyr Alzheimer's2 | 8.9 
BA17 Control2 | 57.8 | Cing Gyr Parkinson's | 41.2 
BA17 Alzheimer's2 | 4.1 | Cing Gyr Parkinson's2 | 44.8 
BA17 Parkinson's | 38.2 | Cing Gyr Huntington's | 100.0 
BA17 Parkinson's2 | 31.0 | Cing Gyr Huntington's2 | 22.4 
BA17 Huntington's | 30.8 | Cing Gyr PSP | 6.8 
BA17 Huntington's2 | 5.2 | Cing Gyr PSP2 | 6.1 
BA17 Depression | 7.8 | Cing Gyr Depression | 2.1 
BA17 Depression2 | 20.2 | Cing Gyr Depression2 | 4.4 



CNS_neurodegeneration_v1.0 Summary: Ag3148 The NOV34a gene is found to be 
down-regulated approximately 2-fold in the temporal cortex of Alzheimer's disease patients 
when compared to normal controls (p = 0.015 analysis by ANCOVA). Multiple research 
groups have shown ornithine decarboxylase to be upregulated in the AD brain; the 
downregulation of this form suggests a shift between polyamine biosynthesis pathways 
during neurodegeneration. The polyamine system has also been implicated in seizure, stroke, 
depression and schizophrenia; therefore this gene is an excellent drug target for any of the 
above disorders. 

References: 

Bernstein HG, Müller M. The cellular localization of the L-ornithine 
decarboxylase/polyamine system in normal and diseased central nervous systems. Prog 
Neurobiol 1999 Apr;57(5):485-505 

Natural polyamines, spermidine and spermine, and their precursor putrescine, are of 
considerable importance for the developing and mature nervous system. They exhibit a 
number of neurophysiological and metabolic effects in the nervous system, including control 
of nucleic acid and protein synthesis, modulation of ionic channels and calcium-dependent 
transmitter release. The polyamine system is also known to be involved in various brain 
pathologic events (seizures, stroke, Alzheimer's disease and others). While cerebral 
polyamine concentrations and the activities of polyamine-metabolizing enzymes have been 
studied in great detail, much less is known about the cells that are responsible for cerebral 
polyamine synthesis and interconversion. With the present review the attempt is made to 
show how exact knowledge about the regional distribution and cellular localization of 
polyamines and the polyamine-synthesizing enzymatic machinery (and especially of 
L-ornithine decarboxylase) may help to better understand the functional interplay between 
polyamines and other endogenous agents (transmitters, receptors, growth factors, neuroactive 
drugs etc.). Polyamines have been localized both in neurones and glial cells. However, the 
main cellular locus of the ODC is the neuron— both in the immature and adult central nervous 
system. Each period of normal brain development and ageing seems to have its own, 
characteristic temporo-spatial pattern of neuronal ODC expression. During strong functional 
activation (kindling, epileptic seizures, neural transplantation) astrocytes and other non- 
neuronal cells do also express ODC and other polyamine-metabolizing enzymes. Astroglial 
expression of ODC is accompanied by an increase in glial fibrillary acidic protein in these 
cells. This shift in the cellular mechanisms of polyamine metabolism is currently far from 
being understood. In human brain diseases (Alzheimer's disease, schizophrenia) certain 
neurones show an increased expression of ODC, the first and rate-limiting enzyme of 
polyamine metabolism. Since polyamines are structurally related to psychoactive drugs 
(neuroleptics, antidepressants) the polyamine system might be of importance as a putative 
target for drug intervention in psychiatry. 

Morrison LD, Cao XC, Kish SJ. Ornithine decarboxylase in human brain: influence of 
aging, regional distribution, and Alzheimer's disease. J Neurochem 1998 Jul;71(1):288-94 

Although experimental animal data have implicated ornithine decarboxylase, a key 
regulatory enzyme of polyamine biosynthesis, in brain development and function, little 
information is available on this enzyme in normal or abnormal human brain. We examined 
the influence, in autopsied human brain, of postnatal development and aging, regional 
distribution, and Alzheimer's disease on the activity of ornithine decarboxylase. Consistent 
with animal data, human brain ornithine decarboxylase activity was highest in the perinatal 
period, declining sharply (by approximately 60%) during the first year of life to values that 
remained generally unchanged up to senescence. In adult brain, a moderately heterogeneous 
regional distribution of enzyme activity was observed, with high levels in the thalamus and 
occipital cortex and low levels in cerebellar cortex and putamen. In the Alzheimer's disease 
group, mean ornithine decarboxylase activity was significantly increased in the temporal 
cortex (+76%), reduced in occipital cortex (-70%), and unchanged in hippocampus and 
putamen. In contrast, brain enzyme activity was normal in patients with the 
neurodegenerative disorder spinocerebellar ataxia type 1. Our demonstration of ornithine 
decarboxylase activity in neonatal and adult human brain suggests roles for ornithine 
decarboxylase in both developing and mature brain function, and we provide further evidence 
for the involvement of abnormal polyamine system activity in Alzheimer's disease. 

Panel 1.3D Summary: Ag3148 Highly brain-preferential expression of the NOV34a gene 
indicates a specific role for this gene in the CNS. Polyamine synthesis by ornithine 
decarboxylase is thought to play a neuroprotective role or recovery role, or both, after 
transient focal ischemia in the CNS. Therefore, agents that enhance the activity of this gene 
product are likely to have medical utility as therapeutics for the treatment of stroke and 
trauma. Other diseases that involve oxidative damage, such as neurodegenerative diseases 
like Alzheimer's disease, also involve defensive mechanisms in which ornithine 
decarboxylase plays a role. Therefore, agents that enhance the activity of this gene are likely 
to have medical utility as therapeutics for the treatment of neurodegenerative diseases such as 
Alzheimer's disease. 

In addition, significant levels of expression are seen in brain and liver cancer cell 
lines. Thus, expression of this gene could be used to differentiate between these samples and 
other samples on this panel and as a marker to detect the presence of these cancers. 
Furthermore, therapeutic modulation of the expression or function of this gene may be 
effective in the treatment of brain and liver cancer. 

References: 

Yatin SM, Yatin M, Aulick T, Ain KB, Butterfield DA. Alzheimer's amyloid beta- 
peptide associated free radicals increase rat embryonic neuronal polyamine uptake and 
ornithine decarboxylase activity: protective effect of vitamin E. Neurosci Lett 1999 Mar 
19;263(1):17-20 

Recent evidence indicates that alterations in brain polyamine metabolism may be 
critical for nerve cell survival after a free radical initiated neurodegenerative process. It has 
been shown previously that A beta(l-42) and A beta(25-35) are toxic to neurons through a 
free radical dependent oxidative mechanism. Treatment of rat embryonic hippocampal 
neuronal cultures with A beta-peptides increased ornithine decarboxylase (ODC) activity and 
spermidine uptake, suggesting that oxidative stress upregulates the polyamine mechanism for 
the repair of free radical damage. Pretreatment of the cells with vitamin E prior to A beta 
exposure decreased ODC activity and spermidine uptake to control level. This study is the 
first to demonstrate that A beta treated cells show an increased polyamine metabolism in 
response to free radical mediated oxidative stress and that the free radical scavenger vitamin 
E prevents these attenuations. These results are discussed with reference to Alzheimer's 
disease. 

Kaasinen K, Koistinaho J, Alhonen L, Janne J. Overexpression of 
spermidine/spermine N-acetyltransferase in transgenic mice protects the animals from 
kainate-induced toxicity. Eur J Neurosci 2000 Feb;12(2):540-8 

We recently generated a transgenic mouse line with activated polyamine catabolism 
through overexpression of spermidine/spermine N1-acetyltransferase (SSAT). A detailed 
analysis of brain polyamine concentrations indicated that all brain regions of these animals 
showed distinct signs of activated polyamine catabolism, e.g. overaccumulation of putrescine 
(three- to 17-fold), appearance of N1-acetylspermidine and decreases in spermidine 
concentrations. In situ hybridization analyses revealed a marked overexpression of SSAT- 
specific mRNA all over the brain tissue of the transgenic animals. The transgenic animals 
appeared to tolerate subcutaneous injections of high-dose kainate substantially better as their 
overall mortality was less than 50% of that of their syngenic littermates. We used the 
expression of glial fibrillary acidic protein (GFAP) as a marker of brain injury in response to 
kainate. In situ hybridization analysis with GFAP oligonucleotide up to 7 days after the 
administration of sublethal kainate doses showed reduced GFAP expression in transgenic 
animals in comparison with their non-transgenic littermates. This difference was especially 
striking in the cerebral cortex of the transgenic mice where the exposure to kainate hardly 
induced GFAP expression. The treatment with kainate likewise resulted in loss of the 
hippocampal (CA3) neurons in non-transgenic but not transgenic animals. These results 
support our earlier findings indicating that elevated concentrations of brain putrescine, 
irrespective whether derived from an overexpression of ornithine decarboxylase, or as shown 
here, from an overexpression of SSAT, play in all likelihood a neuroprotective role in brain 
injury. 

Kilpelainen P, Rybnikova E, Hietala O, Pelto-Huikko M. Expression of ODC and its 
regulatory protein antizyme in the adult rat brain. J Neurosci Res 2000 Dec 1;62(5):675-85 

Ornithine decarboxylase and its inhibitor protein, antizyme, are key regulators of 
polyamine biosynthesis. We examined their expression in the adult rat brain using in situ 
hybridization and immunocytochemistry. Both genes were widely expressed and their 
expression patterns were mostly overlapping and relatively similar. The levels of antizyme 
mRNA were always higher than those of ornithine decarboxylase mRNA. The highest 
expression for both genes was detected in the cerebellar cortex, hippocampus, hypothalamic 
paraventricular and supraoptic nuclei, locus coeruleus, olfactory bulb, piriform cortex and 
pontine nuclei. Ornithine decarboxylase and antizyme mRNAs appeared to be localized in the 
nerve cells. ODC antibody displayed mainly cytoplasmic staining in all brain areas. Antizyme 
antibody staining was mainly cytoplasmic in the most brain areas, although predominantly 
nuclear staining was detected in some areas, most notably in the cerebellar cortex, anterior 
olfactory nucleus and frontal cortex. Our study is the first detailed and comparative analysis 
of ornithine decarboxylase and antizyme expression in the adult mammalian brain. 

Raghavendra Rao VL, Dogan A, Bowen KK, Dempsey RJ. Ornithine decarboxylase 
knockdown exacerbates transient focal cerebral ischemia-induced neuronal damage in rat 
brain. J Cereb Blood Flow Metab 2001 Aug;21(8):945-54 

Transient cerebral ischemia leads to increased expression of ornithine decarboxylase 
(ODC). Contradicting studies attributed neuroprotective and neurotoxic roles to ODC after 
ischemia. Using antisense oligonucleotides (ODNs), the current study evaluated the 
functional role of ODC in the process of neuronal damage after transient focal cerebral 
ischemia induced by middle cerebral artery occlusion (MCAO) in spontaneously 
hypertensive rats. Transient MCAO significantly increased the ODC immunoreactive protein 
levels and catalytic activity in the ipsilateral cortex, which were completely prevented by the 
infusion of antisense ODN specific for ODC. Transient MCAO in rats infused with ODC 
antisense ODN increased the infarct volume, motor deficits, and mortality compared with the 
sense or random ODN-infused controls. Results of the current study support a 
neuroprotective or recovery role, or both, for ODC after transient focal ischemia. 

Farooqui AA, Yi Ong W, Lu XR, Halliwell B, Horrocks LA. Neurochemical 
consequences of kainate-induced toxicity in brain: involvement of arachidonic acid release 
and prevention of toxicity by phospholipase A(2) inhibitors. Brain Res Brain Res Rev 2001 
Dec;38(1-2):61-78 

In kainate-induced neurotoxicity, the stimulation of kainate receptors results in the 
activation of phospholipase A(2) and a rapid release of arachidonic acid from neural 
membrane glycerophospholipids. This process raises arachidonic acid levels and produces 
alterations in membrane fluidity and permeability. These result in calcium influx and 
stimulation of lipolysis and proteolysis, production of lipid peroxides, depletion of ATP, and 
loss of reduced glutathione. As well as the above neurochemical changes, stimulation of 
ornithine decarboxylase, altered activities of protein kinase C isozymes, and expression of 
immediate early genes, cytokines, growth factors, and heat shock proteins have also been 
reported. Kainate-induced stimulation of arachidonic acid release, calcium influx, 
accumulation of lipid peroxides and products of their decomposition, especially 4- 
hydroxynonenal (4-HNE), along with alterations in cellular redox state and ATP depletion 
may play important roles in kainate-induced cell death. Thus the consequences of altered 
glycerophospholipid metabolism in kainate-induced neurotoxicity can lead to cell death. 
Kainate-induced neurotoxicity initiates apoptotic as well as necrotic cell death depending 
upon the intensity of oxidative stress and abnormality in mitochondrial function. Other 
neurochemical changes may be related to synaptic reorganization following kainate-induced 
seizures and may be involved in recapitulation of hippocampal development and 
synaptogenesis. 

Panel 4D Summary: Ag3148: The NOV34a transcript is expressed in activated and 
differentiated T Cells, LPS activated macrophages and dendritic cells. In addition, TNF alpha 
appears to induce expression in epithelial cells, keratinocytes, and fibroblasts. Inhibition of 
ornithine decarboxylase by drugs has been shown to block the respiratory burst in response to 
specific stimuli (see references). Therefore, therapeutics designed with the protein encoded by this 
transcript may alter activation of PMNs and macrophages and be important in the treatment 
of inflammatory diseases such as inflammatory bowel disease, asthma, arthritis and psoriasis. 

References: 

Walters JD, Cario AC, Danne MM, Marucha PT. An inhibitor of ornithine 
decarboxylase antagonizes superoxide generation by primed human polymorphonuclear 
leukocytes. J Inflamm 1998;48(1):40-6 

Tumor necrosis factor-alpha (TNF-alpha) induces a rapid increase in 
polymorphonuclear leukocyte (PMN) polyamine content which appears to be required for 
optimal priming of the respiratory burst. The objective of the present study was to determine 
whether inhibition of polyamine biosynthesis modifies PMN responses to lipopolysaccharide 
(LPS), granulocyte-macrophage colony-stimulating factor (GM-CSF), or granulocyte colony- 
stimulating factor (G-CSF). Treatment with alpha-difluoromethylornithine (DFMO), a 
selective inhibitor of the rate-limiting biosynthetic enzyme ornithine decarboxylase, produced 
dose-dependent inhibition of the respiratory burst in PMNs that were primed by these agents 
and subsequently activated by formyl-Met-Leu-Phe (fMLP). However, DFMO did not 
significantly inhibit fMLP-stimulated superoxide generation or alter the induction of PMN 
adhesion and interleukin-1 beta (IL-1 beta) mRNA expression by LPS or GM-CSF. 
Antagonism of priming by DFMO correlated with a dose-dependent attenuation of fMLP- 
induced intracellular Ca2+ mobilization (r > or = 0.96). Since Ca2+ plays an important role in 
modulating the respiratory burst in primed PMNs, this could, in part, account for the selective 
effects of DFMO. 

PMID: 9368191 

Kaczmarek L, Kaminska B, Messina L, Spampinato G, Arcidiacono A, Malaguarnera 
L, Messina A. Inhibitors of polyamine biosynthesis block tumor necrosis factor-induced 
activation of macrophages. Cancer Res 1992 Apr 1;52(7):1891-4 

The activation of polyamine biosynthesis, dependent on increased gene expression of 
ornithine decarboxylase, has been found to play an important role in the control of cell 
proliferation and differentiation. In this report it has been found that accumulation of 
ornithine decarboxylase mRNA also follows stimulation of human monocytes/macrophages 
by tumor necrosis factor. Human recombinant tumor necrosis factor (100 units/ml) also 
evoked an enhanced respiratory burst of macrophages. The respiratory burst response was 
inhibited in a dose-dependent manner with difluoromethylornithine, an inhibitor of ornithine 
decarboxylase, and methylglyoxal-bis(guanylhydrazone), an inhibitor of the formation of 
spermidine and spermine. The data presented in this paper suggest that polyamines may play 
a functional role in tumor necrosis factor-driven macrophage activation, and they are 
discussed in the context of their possible use as inhibitors of polyamine metabolism in tumor 
chemotherapy. 

PMID: 1312903 

Panel CNS_1 Summary: Ag3148 This panel confirms the expression of the NOV34a gene 
in the CNS. See panel CNS_Neurodegeneration for a discussion of utility of this gene in the 
central nervous system. 

W. NOV35 - CG57339-01: short chain dehydrogenase/reductase-like protein 

Expression of the NOV35 gene was assessed using the primer-probe set Ag3203, 
described in Table WA. Results of the RTQ-PCR runs are shown in Tables WB, WC and 
WD. 



Table WA. Probe Name Ag3203 

Primers   Sequences                                      Length  Start Position  SEQ ID NO: 
Forward   5'-tccttcccactagacaacttga-3'                   22      236             539 
Probe     TET-5'-tcctagcctatagctactcttccgttcca-3'-TAMRA  29      268             540 
Reverse   5'-atcagagcaggaaaccaagaag-3'                   22      297             541 
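
The entries of Table WA can be cross-checked with a short script. The following is a minimal sketch in Python, not part of the original disclosure; it assumes (the text does not say so explicitly) that the start positions are 1-based coordinates on a single strand of the target sequence. The class and function names are illustrative only.

```python
# Sketch: sanity-check the Ag3203 primer/probe set listed in Table WA.
# Assumption (not stated in the text): start positions are 1-based
# coordinates along one strand of the target sequence.
from dataclasses import dataclass


@dataclass
class Oligo:
    name: str
    seq: str
    length: int  # length as listed in Table WA
    start: int   # start position as listed in Table WA

    @property
    def end(self) -> int:
        # Last position covered by the oligo (inclusive).
        return self.start + len(self.seq) - 1


AG3203 = [
    Oligo("Forward", "tccttcccactagacaacttga", 22, 236),
    Oligo("Probe", "tcctagcctatagctactcttccgttcca", 29, 268),
    Oligo("Reverse", "atcagagcaggaaaccaagaag", 22, 297),
]


def check_set(oligos):
    """Return (lengths match table, probe lies between primers, amplicon span)."""
    fwd, probe, rev = oligos
    lengths_ok = all(len(o.seq) == o.length for o in oligos)
    probe_inside = fwd.end < probe.start and probe.end < rev.start
    amplicon_span = rev.end - fwd.start + 1
    return lengths_ok, probe_inside, amplicon_span


print(check_set(AG3203))
```

Under the stated coordinate assumption, the listed lengths agree with the sequences and the probe falls between the two primers, which is the layout expected for a TaqMan-style primer-probe set.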



Table WB. CNS_neurodegeneration_v1.0 



Tissue Name 


Rel. Exp.(%) Ag3203, Run 
209861768 


Tissue Name 


Rel. Exp.(%) Ag3203, Run 
209861768 


AD 1 Hippo 


35.1 


Control (Path) 3 
Temporal Ctx 


3.4 


AD 2 Hippo 


48.6 


Control (Path) 4 
Temporal Ctx 


72.7 


AD 3 Hippo 


18.2 


AD 1 Occipital Ctx 


35.4 


AD 4 Hippo 


15.6 


AD 2 Occipital Ctx 

(Missing) 


0.0 


AD 5 Hippo 


96.6 


AD 3 Occipital Ctx 


32.3 


AD 6 Hippo 


65.1 


AD 4 Occipital Ctx 


22.7 


Control 2 Hippo 


33.9 


AD 5 Occipital Ctx 


63.7 


Control 4 Hippo 


57.4 


AD 6 Occipital Ctx 


21.8 


Control (Path) 3 
Hippo 


9.6 


Control 1 Occipital 
Ctx 


0.0 


AD 1 Temporal Ctx 


34.9 


Control 2 Occipital 
Ctx 


66.9 


AD 2 Temporal Ctx 


42.0 


Control 3 Occipital 
Ctx 


71.2 


AD 3 Temporal Ctx 


16.0 


Control 4 Occipital 
Ctx 


9.5 


AD 4 Temporal Ctx 


34.2 


Control (Path) 1 
Occipital Ctx 


100.0 


AD 5 Inf Temporal 
Ctx 


77.4 


Control (Path) 2 
Occipital Ctx 


23.0 


AD 5 Sup Temporal 
Ctx 


58.6 


Control (Path) 3 
Occipital Ctx 


3.4 


AD 6 Inf Temporal 
Ctx 


71.7 


Control (Path) 4 
Occipital Ctx 


52.9 


AD 6 Sup Temporal 
Ctx 


70.2 


Control 1 Parietal Ctx 


16.6 


Control 1 Temporal 
Ctx 


15.0 


Control 2 Parietal Ctx 


84.7 


Control 2 Temporal 
Ctx 


46.0 


Control 3 Parietal Ctx 


50.0 


Control 3 Temporal 
Ctx 


49.3 


Control (Path) 1 
Parietal Ctx 


56.6 


Control 4 Temporal 
Ctx 


13.1 


Control (Path) 2 
Parietal Ctx 


57.0 


Control (Path) 1 
Temporal Ctx 


45.1 


Control (Path) 3 
Parietal Ctx 


5.6 


Control (Path) 2 
Temporal Ctx 


45.1 


Control (Path) 4 
Parietal Ctx 


46.7 



Table WC. Panel 1.3D 



Tissue Name 


Rel. Exp.(%) Ag3203, 
Run 167994666 


Tissue Name 


Rel. Exp.(%)Ag3203, 
Run 167994666 


Liver adenocarcinoma 


0.0 


Kidney (fetal) 


100.0 


Pancreas 


0.0 


Renal ca. 786-0 


3.4 


Pancreatic ca. CAPAN 2 


0.0 


Renal ca. A498 


3.2 


Adrenal gland 


1.3 


Renal ca. RXF 393 


3.2 


Thyroid 


9.4 


Renal ca. ACHN 


3.0 


Salivary gland 


4.6 


Renal ca.UO-31 


14.6 


Pituitary gland 


29.7 


Renal ca.TK-10 


6.9 


Brain (fetal) 


47.6 


Liver 


5.8 


Brain (whole) 


55.5 


Liver (fetal) 


6.7 


Brain (amygdala) 


30.1 


Liver ca. (hepatoblast) 
HepG2 


3.7 


Brain (cerebellum) 


31.6 


Lung 


12.1 


Brain (hippocampus) 


26.6 


Lung (fetal) 


53.6 


Brain (substantia nigra) 


44.8 


Lung ca. (small cell) 
LX-1 


5.3 


Brain (thalamus) 


44.8 


Lung ca. (small cell) 
NCI-H69 


2.9 


Cerebral Cortex 


45.7 


Lung ca. (s.cell var.) 
SHP-77 


49.3 


Spinal cord 


31.0 


Lung ca. (large 
cell)NCI-H460 


0.0 


glio/astro U87-MG 


12.9 


Lung ca. (non-sm. cell) 
A549 


41.5 


glio/astro U-118-MG 


5.8 


Lung ca. (non-s.cell) 
NCI-H23 


32.5 


astrocytoma SW1783 


21.9 


Lung ca. (non-s.cell) 
HOP-62 


7.5 


neuro*; met SK-N-AS 


0.0 


Lung ca. (non-s.cl) 
NCI-H522 


24.0 


astrocytoma SF-539 


1.5 


Lung ca. (squam.) SW 
900 


15.6 


astrocytoma SNB-75 


12.0 


Lung ca. (squam.) 
NCI-H596 


11.9 


glioma SNB-19 


1.3 


Mammary gland 


37.1 


glioma U251 


15.8 


Breast ca.* (pl.ef) 
MCF-7 


16.7 


glioma SF-295 


17.8 


Breast ca.* (pl.ef) 
MDA-MB-231 


1.9 


Heart (fetal) 


14.2 


Breast ca.* (pl.ef) 
T47D 


31.4 


Heart 


0.8 


Breast ca. BT-549 


6.6 


Skeletal muscle (fetal) 


15.9 


Breast ca. MDA-N 


0.0 


Skeletal muscle 


10.2 


Ovary 


39.8 


Bone marrow 




Ovarian ca OVCAR-3 


10.2 


Thymus 




Ovarian ca. OVCAR-4 


3.0 


Spleen 


37A 


Ovarian ca. OVCAR-5 


55.1 


Lymph node 


18.9 


Ovarian ca. OVCAR-8 


30.1 


Colorectal 


2.5 


Ovarian ca IGROV-1 


3.1 


Stomach 


17.6 


Ovarian ca* (ascites) 
SK-OV-3 


13.4 


Small intestine 


4.2 


Uterus 


26.8 


Colon ca. SW480 


6.9 


Placenta 


0.0 


Colon ca.* 
SW620 (SW480 met) 


39.0 


Prostate 


15.2 


Colon ca. HT29 


1.4 


Prostate ca.* (bone 
met) PC-3 


Colon ca. HCT-116 


6A 


Testis 


95.3 


Colon ca. CaCo-2 


0.0 


Melanoma Hs688(A).T16.5 


Colon ca. 


2.5 


Melanoma* (met) 
Hs688(B).T 


Colon ca. HCC-2998 


5.3 


Melanoma UACC-62 


2.7 


Gastric ca.* (liver met) 
NCI-N87 


15.0 


Melanoma M14 


0.0 


Bladder 


9.0 


Melanoma LOXIMVI 


0.0 


Trachea 


12.3 


Melanoma* (met) SK- 
MEL-5 


Kidney 


17.8 


Adipose 


0.0 



Table WD. Panel 4D 



Tissue Name 


Rel. Exp.(%) Ag3203, 
Run 164389445 


Tissue Name 


Rel. Exp.(%)Ag3203, 
Run 164389445 


Secondary Th1 act 


0.0 


HUVEC IL-1beta 


9.2 


Secondary Th2 act 


10.2 


HUVEC IFN gamma 


6.5 


Secondary Tr1 act 


6.2 


HUVEC TNF alpha + IFN 
gamma 


0.0 




Secondary Th1 rest 


4.4 


HUVEC TNF alpha + IL4 


0.0 


Secondary Th2 rest 






0.5 


Secondary Tr1 rest 


16.4 


Lung Microvascular EC none 


0.0 


Primary Th1 act 


6.4 


Lung Microvascular EC 
TNFalpha + IL-1beta 


0.0 


Primary Th2 act 


0.0 


Microvascular Dermal EC 
none 


0.0 


Primary Tr1 act 


0.0 


Microvascular Dermal EC 
TNFalpha + IL-1beta 


0.0 


Primary Th1 rest 


71.7 


Bronchial epithelium 
TNFalpha + IL-1beta 


13.3 


Primary Th2 rest 


8.6 


Small airway epithelium 
none 


0.0 


Primary Tr1 rest 


26.1 


Small airway epithelium 
TNF alpha IL-1 beta 


12.7 


CD45RA CD4 
lymphocyte act 


13.6 


Coronary artery SMC rest 


23.5 


CD45RO CD4 
lymphocyte act 


0.0 


Coronary artery SMC 
TNFalpha + IL-1 beta 


0.0 


CD8 lymphocyte act 


12.2 


Astrocytes rest 


0.0 


Secondary CD8 
lymphocyte rest 


12.2 


Astrocytes TNFalpha + 
IL-1beta 


0.0 


Secondary CD8 
lymphocyte act 


0.0 


KU-812 (Basophil) rest 


7.7 


CD4 lymphocyte none 


8.2 


KU-812 (Basophil) 
PMA/ionomycin 


15.9 


2ry Th1/Th2/Tr1 anti- 
CD95 CH11 


19.2 


CCD1106 (Keratinocytes) 
none 


33.7 


LAK cells rest 


22.2 


CCD1106 (Keratinocytes) 
TNFalpha + IL-1 beta 


7.3 


LAK cells IL-2 


27.7 


Liver cirrhosis 


36.3 


LAK cells IL-2+IL-12 


0.0 


Lupus kidney 


5.0 


LAK cells IL-2+IFN 
gamma 


19.1 


NCI-H292 none 


31.0 


LAK cells IL-2+IL-18 


0.0 


NCI-H292 IL-4 


14.1 


LAK cells 
PMA/ionomycin 


0.0 


NCI-H292 IL-9 


13.2 


NK Cells IL-2 rest 


18.9 


NCI-H292 IL-13 


3.3 


Two Way MLR 3 day 


10.1 


NCI-H292 IFN gamma 


6.3 


Two Way MLR 5 day 


9.1 


HPAEC none 


13.3 


Two Way MLR 7 day 


9.3 


HPAEC TNF alpha + IL-1 
beta 


0.0 


PBMC rest 


0.0 


Lung fibroblast none 


55.9 


PBMC PWM 


36.9 


Lung fibroblast TNF alpha + 
IL-1 beta 


49.0 


PBMC PHA-L 


9.2 


Lung fibroblast IL-4 


44.1 


Ramos (B cell) none 


18.3 


Lung fibroblast IL-9 


43.5 


Ramos (B cell) 
ionomycin 


48.3 


Lung fibroblast IL-13 


49.0 


B lymphocytes PWM 


75.8 


Lung fibroblast IFN gamma 


A 

D / .U 


B lymphocytes CD40L 
and IL-4 


71.2 


Dermal fibroblast CCD1070 
rest 


23.7 


EOL-1 dbcAMP 


0.0 


Dermal fibroblast CCD1070 
TNF alpha 


60.7 


EOL-1 dbcAMP 
PMA/ionomycin 


0.0 


Dermal fibroblast CCD1070 
IL-1 beta 


54.0 


Dendritic cells none 


6.1 




Dermal fibroblast IFN 

gamma 


OJ.l 


Dendritic cells LPS 


0.0 


: 


Dermal fibroblast IL-4 


52.1 


Dendritic cells anti- 
CD40 


4.9 


IBD Colitis 2 


0.0 


Monocytes rest 


13.1 


IBD Crohn's 


0.0 


Monocytes LPS 


6.6 


Colon 


100.0 


Macrophages rest 


0.0 


Lung 


45.7 


Macrophages LPS 


12.8 


Thymus 


98.6 


HUVEC none 


14.0 


Kidney 


47.6 


HUVEC starved 


0.0 







CNS_neurodegeneration_v1.0 Summary: Ag3203 The NOV35 gene is not found to be 
differentially regulated in Alzheimer's disease; however, a close homolog of this gene has 
been shown to mediate neurotoxicity via amyloid beta binding. Therefore, the NOV35 gene 
may be an excellent drug target for the treatment of Alzheimer's disease, specifically for 
blocking amyloid beta induced neuronal death. 

References: 

He XY, Schulz H, Yang SY. A human brain L-3-hydroxyacyl-coenzyme A 
dehydrogenase is identical to an amyloid beta-peptide-binding protein involved in 
Alzheimer's disease. J Biol Chem 1998 Apr 24;273(17):10741-6 

A novel L-3-hydroxyacyl-CoA dehydrogenase from human brain has been cloned, 
expressed, purified, and characterized. This enzyme is a homotetramer with a molecular mass 
of 108 kDa. Its subunit consists of 261 amino acid residues and has structural features 
characteristic of short chain dehydrogenases. It was found that the amino acid sequence of 
this human brain enzyme is identical to that of an endoplasmic reticulum amyloid beta- 
peptide-binding protein (ERAB), which mediates neurotoxicity in Alzheimer's disease (Yan, 
S. D., Fu, J., Soto, C., Chen, X., Zhu, H., Al-Mohanna, F., Collison, K., Zhu, A., Stern, E., 
Saido, T., Tohyama, M., Ogawa, S., Roher, A., and Stern, D. (1997) Nature 389, 689-695). 
The purification of human brain short chain L-3-hydroxyacyl-CoA dehydrogenase made it 
possible to characterize the structural and catalytic properties of ERAB. This NAD+- 
dependent dehydrogenase catalyzes the reversible oxidation of L-3-hydroxyacyl-CoAs to 
form 3-ketoacyl-CoAs, but it does not act on the D-isomers. The catalytic rate constant of the 
purified enzyme was estimated to be 37 s-1 with apparent Km values of 89 and 20 µM 
for acetoacetyl-CoA and NADH, respectively. The activity ratio of this enzyme for substrates 
with chain lengths of C4, C8, and C16 was approximately 1:2:2. The human short chain L-3- 
hydroxyacyl-CoA dehydrogenase gene is organized into six exons and five introns and maps 
to chromosome Xp11.2. The amino-terminal NAD-binding region of the dehydrogenase is 
encoded by the first three exons, whereas the other exons code for the carboxyl-terminal 
substrate-binding region harboring putative catalytic residues. The results of this study lead to 
the conclusion that ERAB involved in neuronal dysfunction is encoded by the human short 
chain L-3-hydroxyacyl-CoA dehydrogenase gene. 

Panel 1.3D Summary: Ag3203 Highest expression of the NOV35 gene is seen in the fetal 
kidney (CT=32). In addition, significant levels of expression are also seen in cell lines 
derived from ovarian, lung and colon cancers. Thus, expression of this gene could be used to 
differentiate between these samples and other samples on this panel and as a marker to detect 
the presence of these cancers. Furthermore, therapeutic modulation of the expression or 
function of this gene may be effective in the treatment of ovarian, lung and colon cancers. 
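
The relationship between a CT value such as the one quoted above and the Rel. Exp.(%) columns of the tables can be illustrated with a short calculation. The following Python sketch assumes a common RTQ-PCR convention that this section does not itself spell out: the sample with the lowest CT on a panel is set to 100, and every other sample is scaled by 2 raised to the CT difference. Only the fetal kidney CT of 32 comes from the text; the other sample names and CT values are hypothetical and chosen purely for illustration.

```python
# Sketch: convert CT values to relative expression (%), assuming the
# convention Rel. Exp.(%) = 100 * 2 ** (CT_min - CT). This convention is an
# assumption here; the surrounding text does not state the formula.

def relative_expression(ct_values):
    """Map {sample: CT} to {sample: relative expression in percent}."""
    ct_min = min(ct_values.values())
    return {
        sample: 100.0 * 2.0 ** (ct_min - ct)
        for sample, ct in ct_values.items()
    }


example_cts = {
    "Kidney (fetal)": 32.0,         # CT reported in the Panel 1.3D summary
    "hypothetical sample B": 33.0,  # illustrative value, not from the tables
    "hypothetical sample C": 35.0,  # illustrative value, not from the tables
}

for sample, pct in relative_expression(example_cts).items():
    print(f"{sample}: {pct:.1f}")
```

Under this convention each additional cycle halves the reported percentage, so a sample three cycles later than the fetal kidney would appear at 12.5.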

Among metabolic tissues, the NOV35 gene has a low level of expression in the 
pituitary. Therefore, this gene product may be a small molecule target for the treatment of 
diseases of the pituitary, including pituitary adenomas and multiple endocrine neoplasia. 

In addition, expression in the brain confirms the expression of this gene in the CNS. 
See panel CNS_Neurodegeneration for a discussion of utility of this gene in the central 
nervous system. 

Panel 4D Summary: Ag3203 The expression of the NOV35 transcript is highest in colon 
and thymus. This gene is also expressed in fibroblasts, B cells and Th1 cells. Thus, the 
transcript or the protein it encodes could be used as a marker for these tissues. Additionally, 
therapeutics designed with the protein encoded by this transcript could be used for 
maintaining normal homeostasis in the colon and thymus. 

X. NOV36 - CG57341-01: Short Chain dehydrogenase/reductase 

Expression of the NOV36 gene was assessed using the primer-probe set Ag3204, 
described in Table XA. Results of the RTQ-PCR runs are shown in Tables XB, XC and XD. 



Table XA. Probe Name Ag3204 

Primers   Sequences                                   Length  Start Position  SEQ ID NO: 
Forward   5'-ggactttgatcccctacagatg-3'                22      170             542 
Probe     TET-5'-tcaaatgaagaggacatcctctccat-3'-TAMRA  26      199             543 
Reverse   5'-ctgagaacggatagctgagaac-3'                22      225             544 
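
As a rough characterization of the oligonucleotides in Table XA, their GC content and an approximate melting temperature can be computed. The Python sketch below uses the Wallace rule, Tm = 2(A+T) + 4(G+C) in degrees C, which is a textbook approximation intended for short oligonucleotides; it is introduced here only for illustration and is not the method used to design the Ag3204 set, which the text does not describe. The function names are hypothetical.

```python
# Sketch: GC count and a rough Wallace-rule Tm for the Ag3204 oligos.
# The Wallace rule is a coarse estimate; it is an illustrative choice and
# not a method taken from the original text.

AG3204 = {
    "Forward": "ggactttgatcccctacagatg",
    "Probe": "tcaaatgaagaggacatcctctccat",
    "Reverse": "ctgagaacggatagctgagaac",
}


def gc_count(seq: str) -> int:
    """Number of G and C bases in the sequence (case-insensitive)."""
    s = seq.lower()
    return s.count("g") + s.count("c")


def wallace_tm(seq: str) -> int:
    """Approximate Tm in degrees C: 2 per A/T base plus 4 per G/C base."""
    gc = gc_count(seq)
    at = len(seq) - gc  # assumes the sequence contains only a, c, g, t
    return 2 * at + 4 * gc


for name, seq in AG3204.items():
    gc_pct = 100.0 * gc_count(seq) / len(seq)
    print(f"{name}: length={len(seq)}, GC={gc_pct:.1f}%, Tm~{wallace_tm(seq)} C")
```

The probe comes out with a higher estimate than either primer, which is the usual arrangement for a hydrolysis probe, although more accurate nearest-neighbor models would give different absolute values.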



Table XB. CNS_neurodegeneration_v1.0 



Tissue Name 


Rel. Exp.(%) Ag3204, Run 
209861769 


Tissue Name 


Rel. Exp.(%) Ag3204, Run 
209861769 


AD 1 Hippo 


3.9 


Control (Path) 3 
Temporal Ctx 


7 
j.t 


AD 2 Hippo 




Control (Path) 4 
Temporal Ctx 




AD 3 Hippo 


33 


AD 1 Occipital Ctx 


12 


AD 4 Hippo 


3.9 


AD 2 Occipital Ctx 

(Missing) 


0.0 


AD 5 Hippo 


67.4 


AD 3 Occipital Ctx 


3.9 


AD 6 Hippo 


20.9 


AD 4 Occipital Ctx 


14.7 


Control 2 Hippo 


24.7 


AD 5 Occipital Ctx 


16.8 


Control 4 Hippo 






64 2 


Control (Path) 3 Hippo 


3.9 


Control 1 Occipital 
Ctx 


5.1 


AD 1 Temporal Ctx 


7.5 


Control 2 Occipital 
Ctx 


100.0 


AD 2 Temporal Ctx 


10.5 


Control 3 Occipital 
Ctx 


18.6 


AD 3 Temporal Ctx 


2.7 


Control 4 Occipital 
Ctx 


5.3 


AD 4 Temporal Ctx 


12.8 


Control (Path) 1 
Occipital Ctx 


78.5 


AD 5 Inf Temporal Ctx 


57.0 


Control (Path) 2 
Occipital Ctx 


12.1 


AD 5 Sup Temporal 
Ctx 


22.8 


Control (Path) 3 
Occipital Ctx 


3.3 


AD 6 Inf Temporal Ctx 


26.2 


Control (Path) 4 
Occipital Ctx 


22.5 


AD 6 Sup Temporal 
Ctx 


30.6 


Control 1 Parietal 
Ctx 


7.1 


Control 1 Temporal 

Ctx 


7.7 


Control 2 Parietal 
Ctx 


29.1 


Control 2 Temporal 
Ctx 


48.0 


Control 3 Parietal 

Ctx 


23.7 


Control 3 Temporal 
Ctx 


17.1 


Control (Path) 1 
Parietal Ctx 


07 7 


Control 4 Temporal 
Ctx 


4.6 


Control (Path) 2 
Parietal Ctx 


24.5 


Control (Path) 1 
Temporal Ctx 


55.1 


Control (Path) 3 
Parietal Ctx 


5.3 


Control (Path) 2 
Temporal Ctx 


39.5 


Control (Path) 4 
Parietal Ctx 


44.1 



Table XC. Panel 1.3D 



Tissue Name 


Rel. Exp.(%) Ag3204, 
Run 167994669 


Tissue Name 


Rel. Exp.(%) Ag3204, 
Run 167994669 


Liver adenocarcinoma 


23.8 


Kidney (fetal) 


23.2 


Pancreas 


9.9 


Renal ca. 786-0 


7.0 


Pancreatic ca. CAPAN 2 


9.6 


Renal ca. A498 


11.7 


Adrenal gland 


3.8 


Renal ca. RXF 393 


8.8 


Thyroid 


10.0 


Renal ca. ACHN 19.7 


Salivary gland 


5.9 


Renal ca. UO-31 


2.6 


Pituitary gland 


10.5 


Renal ca. TK-10 


22.8 


Brain (fetal) 




Liver 


7.8 


Brain (whole) 


46.3 


Liver (fetal) 


8.2 


Brain (amygdala) 


16.8 


Liver ca. (hepatoblast) 
HepG2 


Brain (cerebellum) 


22.7 


Lung 


2.2 


Brain (hippocampus) 


18.9 


Lung (fetal) 


5.6 


Brain (substantia nigra) 


20.3 


Lung ca. (small cell) 
LX-1 


19.1 


Brain (thalamus) 


26.6 


Lung ca. (small cell) 

NCI-H69 


6.1 


Cerebral Cortex 


66.4 


Lung ca. (s.cell var.) 
SHP-77 


75.3 


Spinal cord 


6.9 


Lung ca. (large 
cell)NCI-H460 


1.5 


glio/astro U87-MG 


11.8 


Lung ca. (non-sm. cell) 

A549 


13.4 


glio/astroU-118-MG 


5.0 


Lung ca. (non-s.cell) 
NCI-H23 


5.0 


astrocytoma SW1783 


23.7 


Lung ca. (non-s.cell) 
HOP-62 


10.7 


neuro*; met SK-N-AS 


12.9 


Lung ca. (non-s.cl) 
NCI-H522 


46.3 


astrocytoma SF-539 


5.4 


Lung ca. (squam.) SW 
900 


4.8 


astrocytoma SNB-75 


9.4 


Lung ca. (squam.) 
NCI-H596 


23.5 


glioma SNB-19 


13.9 


Mammary gland 


16.5 


glioma U251 


28.5 


Breast ca.* (pl.ef) 
MCF-7 


1.5 




glioma SF-295 


21.9 


Breast ca.* (pl.ef) 
MDA-MB-231 


13.0 


Heart (fetal) 




Breast ca.* (pl.ef) 
T47D 




Heart 


9.7 


Breast ca. BT-549 


5.1 


Skeletal muscle (fetal) 


13.3 


Breast ca. MDA-N 


12.0 


Skeletal muscle 


18.8 


Ovary 


8.1 


Bone marrow 


5.2 


Ovarian ca. OVCAR-3 


10.2 


Thymus 


4J 


Ovarian ca OVCAR-4 


15.3 


Spleen 


3.1 


Ovarian ca OVCAR-5 


46.0 


Lymph node 


6.5 


Ovarian ca OVCAR-8 


3.5 


Colorectal 


69.7 


Ovarian ca IGROV-1 


4.4 


Stomach 


8.7 


Ovarian ca* (ascites) 
SK-OV-3 


45.4 


Small intestine 


77.4 


Uterus 


2.6 


Colon ca.SW480 


26.4 


Placenta 


0.6 


Colon ca.* 


47.3 


Prostate 


5.8 


Colon ca. HT29 


Prostate ca.* (bone 
met) PC-3 


4.5 


Colon ca. HCT-116 


14.0 


Testis 


7 0 


Colon ca. CaCo-2 


100.0 


Melanoma Hs688(A).T 


0.9 


Colon ca. 
tissue(OD03866) 


4.9 


Melanoma* (met) 
Hs688(B).T 


0.8 


Colon ca. HCC-2998 


28.1 


Melanoma UACC-62 


13.0 


Gastric ca.* (liver met) 
NCI-N87 


18.6 


Melanoma M14 


3.2 


Bladder 


5.3 


Melanoma LOXIMVI 


18.2 


Trachea 


1.9 


Melanoma* (met) SK- 
MEL-5 


15.6 


Kidney 


27.2 


Adipose 


6.6 



Table XD . Panel 4D 



Tissue Name 


Rel. Exp.(%) Ag3204, 
Run 164389446 


Tissue Name 


Rel. Exp.(%) Ag3204, 
Run 164389446 


Secondary Th1 act 


5.0 


HUVEC IL-1beta 


1.2 


Secondary Th2 act 


3.3 


HUVEC IFN gamma 


2.0 


Secondary Tr1 act 


4.0 


HUVEC TNF alpha + IFN 
gamma 


1.0 


Secondary Th1 rest 


OJ 


HUVEC TNF alpha + IL4 


2.9 


Secondary Th2 rest 


0.6 


HUVEC IL-11 


2.5 


Secondary Tr1 rest 


0.2 


Lung Microvascular EC none 


2.9 


Primary Th1 act 


7.4 


Lung Microvascular EC 
TNFalpha + IL-1beta 


2.4 


Primary Th2 act 


4.6 


Microvascular Dermal EC 
none 


3.6 


Primary Tr1 act 


>.4 I 


Microvascular Dermal EC 
TNFalpha + IL-1beta 


i.8 


Primary Th1 rest 


1.9 


Bronchial epithelium 
TNFalpha + IL-1beta 


> c 
LD 


Primary Th2 rest 


< 

I.l 


Small airway epithelium 
none 


2.9 


Primary Tr1 rest 


1.8 


Small airway epithelium 
TNFalpha + IL-1beta 


8.8 


CD45RA CD4 
lymphocyte act 


1.4 


Coronary artery SMC rest 


0.6 


CD45RO CD4 
lymphocyte act 


3.6 


Coronary artery SMC 
TNFalpha + IL-1beta 


0.1 


CD8 lymphocyte act 


3.8 


Astrocytes rest 


4.8 


Secondary CD8 
lymphocyte rest 


1.8 


Astrocytes TNFalpha + 
IL-1beta 


3.6 


Secondary CD8 
lymphocyte act 


2.5 


KU-812 (Basophil) rest 


6.6 


CD4 lymphocyte none 


0.6 


KU-812 (Basophil) 
PMA/ionomycin 




2ry Thl/Th2/Trl anti- 
CD95CH11 


0.8 


CCD1106 (Keratinocytes) 
none 


A A 


LAK cells rest 


2.2 


CCD1106 (Keratinocytes) 
TNFalpha + IL-1 beta 


1.8 




LAK cells IL-2 


2.1 


Liver cirrhosis 




LAK cells IL-2+IL-12 


2.4 


Lupus kidney 


0.4 


LAK cells IL-2+IFN 
gamma 


2.9 


NCI-H292 none 


7.0 


LAK cells IL-2+IL-18 


2.4 


NCI-H292 IL-4 


8.8 


LAK cells 
PMA/ionomycin 


1.2 


NCI-H292 IL-9 


7.4 


NK Cells IL-2 rest ll.4 


NCI-H292 IL-13 


3.9 


Two Way MLR 3 day 


1.5 


NCI-H292 IFN gamma 


4.1 


Two Way MLR 5 day 


1.6 


HPAEC none 


1.8 


Two Way MLR 7 day 


1.4 


HPAEC TNF alpha + IL-1 
beta 


2.0 


PBMC rest 


0.0 


Lung fibroblast none 


1.4 


PBMC PWM 


6.7 


Lung fibroblast TNF alpha + 
IL-1 beta 


0.6 


PBMC PHA-L 


3.4 


Lung fibroblast IL-4 


2.4 


Ramos (B cell) none 


3.6 


Lung fibroblast IL-9 


2.8 


Ramos (B cell) 
ionomycin 


I1O.7 


Lung fibroblast IL-13 


1.7 


B lymphocytes PWM 


9.4 


Lung fibroblast IFN gamma 


1.6 


B lymphocytes CD40L 
and IL-4 


2.1 


Dermal fibroblast CCD1070 
rest 


3.8 


EOL-1 dbcAMP 


2.2 


Dermal fibroblast CCD1070 
TNF alpha 


4.1 



EOL-1 dbcAMP 
PMA/ionomycin 


0.8 


Dermal fibroblast CCD1070 
IL-1 beta 


0.4 


Dendritic cells none 


27.0 


Dermal fibroblast IFN 
gamma 


1.0 


Dendritic cells LPS 


28.9 


Dermal fibroblast IL-4 


2.8 


Dendritic cells anti- 
CD40 


41.0 


IBD Colitis 2 


U.l 


Monocytes rest 


0.9 


IBD Crohn's 


6.0 


Monocytes LPS 


0.2 


Colon 


100.0 


Macrophages rest 


22.5 


Lung 


2.0 


Macrophages LPS 


0.8 


Thymus 


11.6 


HUVEC none 


2.8 


Kidney 


1.6 


HUVEC starved 


7.4 







CNS_neurodegeneration_v1.0 Summary: Ag3204 The NOV36 gene is found to be 
significantly (p = 0.0008) downregulated in the temporal cortex of Alzheimer's disease 
patients when compared to controls. A close homolog of this gene has been shown to mediate 
neurotoxicity via amyloid beta binding. The NOV36 gene may therefore be an excellent drug 
target for the treatment of Alzheimer's disease, specifically for blocking amyloid beta induced 
neuronal death. Results from a second experiment with the same probe and primer are not 
included. The amp plot indicates there were experimental difficulties with this run. 

References: 

He XY, Schulz H, Yang SY. A human brain L-3-hydroxyacyl-coenzyme A 
dehydrogenase is identical to an amyloid beta-peptide-binding protein involved in 
Alzheimer's disease. J Biol Chem 1998 Apr 24;273(17):10741-6 

A novel L-3-hydroxyacyl-CoA dehydrogenase from human brain has been cloned, 
expressed, purified, and characterized. This enzyme is a homotetramer with a molecular mass
of 108 kDa. Its subunit consists of 261 amino acid residues and has structural features
characteristic of short chain dehydrogenases. It was found that the amino acid sequence of 
this human brain enzyme is identical to that of an endoplasmic reticulum amyloid beta- 
peptide-binding protein (ERAB), which mediates neurotoxicity in Alzheimer's disease (Yan, 
S. D., Fu, J., Soto, C., Chen, X., Zhu, H., Al-Mohanna, F., Collison, K., Zhu, A., Stern, E.,
Saido, T., Tohyama, M., Ogawa, S., Roher, A., and Stern, D. (1997) Nature 389, 689-695).
The purification of human brain short chain L-3-hydroxyacyl-CoA dehydrogenase made it 
possible to characterize the structural and catalytic properties of ERAB. This NAD+-dependent
dehydrogenase catalyzes the reversible oxidation of L-3-hydroxyacyl-CoAs to
form 3-ketoacyl-CoAs, but it does not act on the D-isomers. The catalytic rate constant of the
purified enzyme was estimated to be 37 s-1 with apparent Km values of 89 and 20 µM
for acetoacetyl-CoA and NADH, respectively. The activity ratio of this enzyme for substrates
with chain lengths of C4, C8, and C16 was approximately 1:2:2. The human short chain
L-3-hydroxyacyl-CoA dehydrogenase gene is organized into six exons and five introns and maps
to chromosome Xp11.2. The amino-terminal NAD-binding region of the dehydrogenase is
encoded by the first three exons, whereas the other exons code for the carboxyl-terminal 
substrate-binding region harboring putative catalytic residues. The results of this study lead to 
the conclusion that ERAB involved in neuronal dysfunction is encoded by the human short 
chain L-3-hydroxyacyl-CoA dehydrogenase gene. 

Panel 1.3D Summary: Ag3204 The NOV36 gene is expressed at a low level in most of the
cancer cell lines and normal tissues. There appears to be significantly higher expression in 
colon, lung, breast and ovarian cancer cell lines with the highest expression shown by a colon 
cancer cell line (CT=30.94). Thus, therapeutic inhibition of the NOV36 gene product, 
through the use of small molecule drugs, might be of utility in the treatment of the above 
listed cancer types. 

Among tissues with metabolic function, this gene has low levels of expression in 
pancreas, thyroid, pituitary, adult and fetal heart, adult and fetal liver, adult and fetal skeletal 
muscle, and adipose. This gene product may be a small molecule target for the treatment of 
metabolic and endocrine disease, including the thyroidopathies, Types 1 and 2 diabetes and
obesity. 

In addition, this panel confirms the expression of this gene in the CNS. See panel 
CNS_neurodegeneration for a discussion of utility of this gene in the central nervous system.

Panel 4D Summary: Ag3204 The NOV36 transcript is expressed at significant levels in the
colon and in some types of antigen presenting cells (APCs) including activated dendritic
cells, resting macrophages, and activated B cells. This pattern of expression suggests that the
protein encoded by this transcript may be involved in gut immunity, particularly in the
function or maintenance of APCs. The NOV36 transcript encodes a putative reductase.
Therefore, regulation of reductase expression could function by modulating gut immunity and
be important in the treatment of inflammatory bowel diseases.

Example 4. Differential Gene Expression in clear cell Renal cell carcinomas
vs normal adjacent tissues

To obtain a comprehensive profile of those genes whose expression is modulated in
clear cell Renal cell carcinomas, GeneCalling™ technology, described in detail in Shimkets
et al. (1999) and in US Patent No. 5,871,697, was used to distinguish the gene expression
profile of clear cell Renal cell carcinoma tissues from that of the normal adjacent tissues, obtained
from the same patient, during surgical nephrectomy. The tissues were provided to CuraGen
from the NDRI under an IRB approved protocol. GeneCalling™ technology relies on
Quantitative Expression Analysis to generate the gene expression profile of a given sample
and then generates differential expression analysis of pair-wise comparison of these profiles
to controls. The comparison in this example is a pool of all tumor tissues vs. a pool of all
normal tissues. Polynucleotides exhibiting differential expression were confirmed by
conducting a PCR reaction according to the GeneCalling™ protocol, with the addition of a
competing unlabelled primer that prevents the amplification from being detected.

Table 2: GeneCalling results from Job 36320 - all kidney cancer vs all kidney NAT

[Figure residue: axis labels listing cell lines and tissues (lung, liver, renal, kidney, trachea, colon, small intestine, lymph node, heart, spinal cord, cerebral cortex, brain, pancreas); no values are recoverable.]

Surprisingly, several cDNA fragments from ARP were over-expressed in 9 out of 11
clear cell Renal cell carcinomas.

Example 5. TaqMan™ Analysis of ARP

ARP was then subjected to TaqMan™ analysis (TaqMan™ polymerase chain reaction
detection; Perkin Elmer, Applied Biosystems Division, Foster City, CA). The specific details
of the PCR reactions are as follows:

Tissues were ground to a fine powder under liquid N2 using a motorized grinding mill
(Certiprep, #6800-115) and made into lysate by addition of Trizol (Life Tech, cat. #15596-018)
at 1.0 ml Trizol/100 mg tissue. Total RNA was extracted from this lysate by extraction
with BCP (bromochloropropane; MRC, BP-151), added in an amount equal to one tenth the
volume of the Trizol lysate, followed by precipitation of total RNA from the aqueous layer by
addition of an equal volume of isopropanol. The precipitate was recovered by spinning the
solution at 13,200 rpm for 10 minutes in micro-centrifuges (or at 9000 rpm for 15 min in a
Beckman G15 centrifuge for lysate volume > 1.0 ml). The precipitates were washed once
with 70% ethanol, air dried briefly and resuspended in 100 µl DEPC treated water (Ambion,
cat. #9920).

To remove any genomic DNA contamination from the resulting total RNA
preparations, they were treated with DNase (2 µl; 10 U/µl; Qiagen, cat. #79254) in the
presence of 1x DNase buffer from Promega for 30 min at 37°C. RNA was extracted by
addition of equal volumes of acid phenol:chloroform (Ambion, cat. #9720), followed by
precipitation from the aqueous phase with 0.3 M sodium acetate (Fluka, cat. #71196) and
two volumes of ethanol. The precipitate was recovered by spinning as above, washed once
with 70% ethanol and resuspended in 50 µl DEPC treated water.

RNA was quantitated fluorometrically (Tecan SpectraFluor Plus) using an RNA
specific dye, Ribogreen (Molecular Probes, Eugene OR; Catalog number R-11491) according
to the manufacturer's directions. The quality of the RNA was determined by running the
RNA either on agarose-formaldehyde gels or RNA chips (Agilent 5064-8229) from Agilent
Technology (2100 Bioanalyser).

The RNA samples for each cell or tissue were normalized according to RNA input by 
RNA quantification using Ribogreen (as described above) using a standard curve covering 

the concentration range of 1 ng/ml through 50 ng/ml RNA. 
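
By way of illustration only, the standard-curve normalization described above can be sketched as a linear fit of fluorescence against known RNA concentrations, followed by inversion of the fit for an unknown sample. The fluorescence readings, the function names and the assumption of a linear dye response across the 1 ng/ml through 50 ng/ml range are hypothetical and are not taken from the manufacturer's directions.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def concentration_from_signal(signal, slope, intercept):
    """Invert the standard curve to estimate RNA concentration in ng/ml."""
    return (signal - intercept) / slope


# Hypothetical standards covering 1-50 ng/ml; the readings are invented.
standards_ng_per_ml = [1.0, 5.0, 10.0, 25.0, 50.0]
fluorescence = [120.0, 520.0, 1020.0, 2520.0, 5020.0]

slope, intercept = fit_line(standards_ng_per_ml, fluorescence)
sample_ng_per_ml = concentration_from_signal(2020.0, slope, intercept)
```

A reading that falls outside the range spanned by the standards would ordinarily be re-measured after dilution rather than extrapolated.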
Absence of genomic DNA contamination in every RNA sample was confirmed by
monitoring the expression of human polypeptide chain elongation factor-1 alpha (GenBank
Accession Number: E02629) and human ADP-ribosylation factor 1 (ARF1) mRNA
(GenBank Accession Number: M36340) by TAQMAN®, without performing a reverse
transcription step prior to the PCR cycles (minus RT-TAQMAN® assay). Ten ng of RNA
(total or polyA+) were used in a 25 µl TAQMAN® reaction using probe and primer sets
specific for intronless segments of human polypeptide chain elongation factor-1 alpha and
human ADP-ribosylation factor 1 (ARF1) mRNA. Probe and primer sets were designed for
each assay according to a proprietary software package. Reactions were carried out using the
TAQMAN® universal PCR Master Mix (Applied Biosystems, Foster City, CA, USA; cat #
4304447) according to the manufacturer's protocol. Reactions were performed using 96 well
optical plates and caps (Applied Biosystems, cat # 403012) on an ABI Prism 7700®
Sequence Detection System (Applied Biosystems) using the following parameters: 10 min at
95°C; 15 sec at 95°C/1 min at 60°C (40 cycles). Results were recorded as CT values (cycle at
which a given sample crosses a threshold level of fluorescence) using a log scale. Any sample
showing a CT value lower than 35 for either of the two tested genes was treated again with
DNase I following the protocol described previously.
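
The re-treatment rule stated above may be expressed, purely as a sketch, as a filter over minus-RT CT values. The sample names and CT values below are invented, and the use of None to represent a well that never crossed threshold within 40 cycles is an assumption of this sketch.

```python
CT_CUTOFF = 35.0  # from the text: a minus-RT CT below 35 triggers re-treatment


def needs_dnase_retreatment(minus_rt_ct):
    """Return True if either control gene amplified before the cutoff.

    minus_rt_ct maps a control gene name to its CT value, or to None when
    no amplification was detected (an assumption of this sketch).
    """
    return any(ct is not None and ct < CT_CUTOFF for ct in minus_rt_ct.values())


# Hypothetical minus-RT results for two RNA samples.
samples = {
    "sample_A": {"EF-1 alpha": 38.2, "ARF1": None},
    "sample_B": {"EF-1 alpha": 33.9, "ARF1": 36.5},
}

flagged = [name for name, cts in samples.items() if needs_dnase_retreatment(cts)]
```

Here only sample_B would be returned for a further round of DNase treatment, because one of its two control genes crossed threshold before cycle 35.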

RNA (2-10 µg total or polyA+) was converted to cDNA using Superscript II (Life
Tech; cat# 18064-147) and random hexamers. Reactions were performed in a volume of 20
µl and incubated for 60 min at 42°C to generate the single stranded cDNA (sscDNA).

sscDNA was then diluted in DEPC-water to a final concentration of 0.2 ng/µl
(assuming a 1:1 RNA to cDNA conversion ratio). Five µl of sscDNA was transferred to a
separate plate for the TAQMAN® reaction using probe and primer sets specific for human
polypeptide chain elongation factor-1 alpha and human ADP-ribosylation factor 1 (ARF1)
mRNA. TAQMAN® reactions were performed following the minus RT-TAQMAN® assay
protocol described previously. Results were recorded as CT values, with the difference in
RNA concentration between a given sample and the sample with the lowest CT value being
represented as 2 to the power of delta CT (2^ΔCT). The percent relative expression is then
obtained by taking the reciprocal of this RNA difference and multiplying by 100. The
median CT values obtained for two housekeeping genes: human polypeptide chain elongation
factor-1 alpha (hEF-1a) and human ADP-ribosylation factor 1 (hARF1) were used to
normalize sscDNA samples within each panel. The concentrations of the sscDNA samples
were adjusted so as to be within the median CT value, +/- one CT unit for these two
housekeeping genes. After every round of sscDNA concentration adjustment, the relative
gene expression for hEF-1a and hARF1 sscDNA was measured by TAQMAN® as described
previously.

Normalized sscDNA (5 µl) for each sample was analyzed via TAQMAN® following
the minus RT-TAQMAN® assay protocol described previously. Probes and primers were
designed for each assay according to a proprietary software package using the sequence of
GenBank Accession number AF153606, AF169312, or AF202636 as input. The primers and
probe were also designed to specifically identify the gene of the invention irrespective of the
presence of related human genes, such as splice forms, homologs and paralogs. The primers
and probe are shown in Table 3.



Table 3. Primer-probe set 2012.

Primers   Sequences                                  TM     Length   Start Position   SEQ ID NO:
Forward   5'-AAGGCTCAGAACAGCAGGAT-3'                 59     20       478              545
Probe     TET-5'-CAACTCTTCCACAAGGTGGCCCAG-3'-TAMRA   70     24       502              546
Reverse   5'-GCTTTGCAGATGCTGAATTC-3'                 58.6   20       557              547

Default settings were used for reaction conditions and the following parameters were
set before selecting primers: primer concentration = 250 nM; primer melting temperature
(Tm) range = 58°-60°C; primer optimal Tm = 59°C; maximum primer difference = 2°C;
probe does not have 5' G; probe Tm must be 10°C greater than primer Tm; amplicon size is
75 bp to 100 bp, optimal amplicon size = 80 bp. The probes selected (see below) were
synthesized by Synthegen (Houston TX, USA), Applied Biosystems (Foster City CA, USA),
or Biosearch Technologies, Inc. (Novato CA, USA). Primers were synthesized by Life
Technologies (Rockville MD, USA). Probes were purified first by anion exchange HPLC,
followed by reverse phase HPLC to remove uncoupled dyes and non full length products.
Primers were fully de-protected, and desalted using a C-18 spin-column. All TAQMAN®
reactions were performed using 250 nM of probe and 1.125 µM of reverse and forward
primers.
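
As a sketch only, the selection parameters listed above can be applied to the values reported for primer-probe set 2012 in Table 3. The proprietary design software is not described in the specification, so the function below is a hypothetical check rather than that software. It also assumes that the reverse primer's start position marks the last base of the amplicon on the sense strand, which the specification does not state.

```python
PRIMER_TM_RANGE = (58.0, 60.0)  # degrees C, from the stated parameters
MAX_PRIMER_TM_DIFF = 2.0        # degrees C
MIN_PROBE_TM_MARGIN = 10.0      # probe Tm minus primer Tm, degrees C
AMPLICON_RANGE = (75, 100)      # bp


def check_set(forward, reverse, probe):
    """Return (problems, amplicon_size) for one primer-probe set.

    Each argument is a dict with 'seq', 'tm' and 'start' keys.
    """
    problems = []
    for label, primer in (("forward", forward), ("reverse", reverse)):
        if not PRIMER_TM_RANGE[0] <= primer["tm"] <= PRIMER_TM_RANGE[1]:
            problems.append(label + " primer Tm outside 58-60 C")
    if abs(forward["tm"] - reverse["tm"]) > MAX_PRIMER_TM_DIFF:
        problems.append("primer Tm difference above 2 C")
    if probe["seq"].upper().startswith("G"):
        problems.append("probe has a 5' G")
    if probe["tm"] - max(forward["tm"], reverse["tm"]) < MIN_PROBE_TM_MARGIN:
        problems.append("probe Tm less than 10 C above primer Tm")
    # Assumption: the reverse primer's start position is the last amplicon base.
    amplicon_size = reverse["start"] - forward["start"] + 1
    if not AMPLICON_RANGE[0] <= amplicon_size <= AMPLICON_RANGE[1]:
        problems.append("amplicon size outside 75-100 bp")
    return problems, amplicon_size


# Values as reported in Table 3 for set 2012.
fwd = {"seq": "AAGGCTCAGAACAGCAGGAT", "tm": 59.0, "start": 478}
rev = {"seq": "GCTTTGCAGATGCTGAATTC", "tm": 58.6, "start": 557}
prb = {"seq": "CAACTCTTCCACAAGGTGGCCCAG", "tm": 70.0, "start": 502}

problems, amplicon_size = check_set(fwd, rev, prb)
```

Under the stated assumption about the reverse start position, the set passes every listed rule and the amplicon spans 80 bp.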

RTQ-PCR Panel 1 Description

This 96 well plate (2 control wells, 94 test samples) panel and its variants (Panel 1.X,
etc.) are composed of RNA/cDNA isolated from various human cell lines that have been
established from human malignant tissues (Tumors). These cell lines have been extensively
characterized by investigators in both academia and the commercial sector regarding their
tumorigenicity, metastatic potential, drug resistance, invasive potential and other cancer-related
properties. They serve as suitable tools for pre-clinical evaluation of anti-cancer
agents and promising therapeutic strategies. RNA from these various human cancer cell lines
was isolated and procured for CuraGen Corporation by the Developmental Therapeutic
Branch (DTB) of the National Cancer Institute (USA). Basic information regarding their
biological behavior, gene expression, and resistance to various cytotoxic agents is provided
by the DTB (http://dtp.nci.nih.gov/).

In addition, RNA/cDNA was obtained from various human tissues derived from 
human autopsies performed on deceased elderly people or sudden death victims (accidents, 
etc.). These tissues were ascertained to be free of disease and were purchased from various 
high quality commercial sources such as Clontech, Research Genetics, and Invitrogen. 

RNA integrity from all samples is controlled for quality by visual assessment of
agarose gel electrophoresis using 28s and 18s ribosomal RNA staining intensity ratio as a
guide (2:1 to 2.5:1 28s:18s) and the presence of low molecular weight RNAs indicative of
degradation products. Samples are quality controlled for genomic DNA contamination by
reactions run in the absence of reverse transcriptase using probe and primer sets designed to
amplify across the span of a single exon.
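
The 28s:18s guide ratio above is applied by visual assessment. Where band intensities are available as numbers, for example from densitometry, the same guide could be checked as in the following sketch. The intensity values and the function name are hypothetical, and treating the 2:1 to 2.5:1 guide as a hard acceptance window is an assumption of this sketch.

```python
def rrna_ratio_check(intensity_28s, intensity_18s, low=2.0, high=2.5):
    """Return (within_guide, ratio) for a 28s:18s staining intensity ratio."""
    if intensity_18s <= 0:
        raise ValueError("18s intensity must be positive")
    ratio = intensity_28s / intensity_18s
    return low <= ratio <= high, ratio


# Hypothetical densitometry readings, arbitrary units.
intact_ok, intact_ratio = rrna_ratio_check(220.0, 100.0)
degraded_ok, degraded_ratio = rrna_ratio_check(90.0, 100.0)
```

A ratio well below the guide range, as in the second reading, would be consistent with the degradation the text describes, although the presence of low molecular weight RNA still has to be judged from the gel itself.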

RTQ-PCR Panel 2 Description

This 96 well (2 control wells, 94 test samples) panel and its variants (Panel 2X, etc.)
are composed of RNA/cDNA isolated from human tissue procured by surgeons working in
close cooperation with the National Cancer Institute's Cooperative Human Tissue Network
(CHTN) or the National Disease Research Interchange (NDRI). The tissues procured are
derived from human malignancies and, in cases where indicated, many malignant tissues have
"matched margins". The tumor tissue and the "matched margins" are evaluated by two
independent pathologists (the surgical pathologist and again by a pathologist at NDRI or
CHTN). This analysis provides a gross histopathological assessment of tumor differentiation
grade. Moreover, most samples include the original surgical pathology report that provides
information regarding the clinical stage of the patient. These matched margins are taken from
the tissue surrounding (i.e. immediately proximal to) the zone of surgery. In addition,
RNA/cDNA was obtained from various human tissues derived from human autopsies
performed on deceased elderly people or sudden death victims (accidents, etc.). These tissues
were ascertained to be free of disease and were purchased from various high quality
commercial sources such as Clontech, Research Genetics, and Invitrogen.

RNA integrity from all samples is controlled for quality by visual assessment of
agarose gel electrophoresis using 28s and 18s ribosomal RNA staining intensity ratio as a
guide (2:1 to 2.5:1 28s:18s) and the presence of low molecular weight RNAs indicative of
degradation products. Samples are quality controlled for genomic DNA contamination by
reactions run in the absence of reverse transcriptase using probe and primer sets designed to
amplify across the span of a single exon.

RTQ-PCR Panel 4 Description

This 96 well plate (2 control wells, 94 test samples) is composed of RNA (Panel 4r)
or cDNA (Panel 4d) isolated from various human cell lines or tissues. Total RNA from
control normal tissues (colon and lung) was purchased from Stratagene; thymus and kidney
total RNA was obtained from Clontech. Total RNA from liver tissue from cirrhosis patients
and kidney from lupus patients was obtained from Biochain. Intestinal tissue for RNA
preparation from Crohn's and ulcerative colitis patients was obtained from the National
Disease Research Interchange (NDRI) (Philadelphia, PA).

Astrocytes, lung fibroblasts, dermal fibroblasts, coronary artery smooth muscle cells,
small airway epithelium, bronchial epithelium, microvascular dermal endothelial cells,
microvascular lung endothelial cells, human pulmonary aortic endothelial cells, and human
umbilical vein endothelial cells were all purchased from Clonetics and grown in the media
supplied for these cell types by Clonetics. These primary cell types were activated with
various cytokines or combinations of cytokines for 6 and/or 12-14 hours. The following
cytokines were used: IL-1 beta at approximately 1-5 ng/ml, TNF alpha at approximately 5-10
ng/ml, IFN gamma at approximately 20-50 ng/ml, IL-4 at approximately 5-10 ng/ml, IL-9 at
approximately 5-10 ng/ml, IL-13 at approximately 5-10 ng/ml. For endothelial cells we
sometimes starved the cells for various times by culture in the basal media from Clonetics
with 0.1% serum.

Mononuclear cells were prepared from blood of employees at CuraGen Corporation,
using Ficoll. LAK cells were prepared from these cells by culture in DMEM 5% FCS
(Hyclone), 100 µM non essential amino acids (Gibco), 1 mM sodium pyruvate (Gibco),
mercaptoethanol 5.5 x 10^-5 M (Gibco), and 10 mM Hepes (Gibco) and Interleukin 2 for 4-6
days. Cells were then either activated with 10-20 ng/ml PMA and 1-2 µg/ml ionomycin,
IL-12 at 5-10 ng/ml, IFN gamma at 20-50 ng/ml and IL-18 at 5-10 ng/ml for 6 hours. In some
cases, mononuclear cells were cultured for 4-5 days in DMEM 5% FCS (Hyclone), 100 µM
non essential amino acids (Gibco), 1 mM sodium pyruvate (Gibco), mercaptoethanol 5.5 x
10^-5 M (Gibco), and 10 mM Hepes (Gibco) with PHA or PWM at approximately 5 µg/ml.
Samples were taken at 24, 48 and 72 hours for RNA preparation. MLR samples were
obtained by taking blood from two donors, isolating the mononuclear cells using Ficoll and
mixing the isolated mononuclear cells 1:1 at a final concentration of approximately 2x10^
cells/ml in DMEM 5% FCS (Hyclone), 100 µM non essential amino acids (Gibco), 1 mM
sodium pyruvate (Gibco), mercaptoethanol (5.5 x 10^-5 M) (Gibco), and 10 mM Hepes
(Gibco). The MLR was cultured and samples taken at various time points ranging from 1-7
days for RNA preparation.

To prepare monocytes, macrophages and dendritic cells, monocytes were isolated
from mononuclear cells using CD14 Miltenyi Beads, +ve VS selection columns and a Vario
Magnet as per the manufacturer's instructions. Monocytes were differentiated into dendritic
cells by culture in DMEM 5% FCS (Hyclone), 100 µM non essential amino acids (Gibco), 1
mM sodium pyruvate (Gibco), mercaptoethanol 5.5 x 10^-5 M (Gibco), and 10 mM Hepes
(Gibco), 50 ng/ml GMCSF and 5 ng/ml IL-4 for 5-7 days. Macrophages were prepared by
culture of monocytes for 5-7 days in DMEM 5% FCS (Hyclone), 100 µM non essential
amino acids (Gibco), 1 mM sodium pyruvate (Gibco), mercaptoethanol 5.5 x 10^-5 M (Gibco),
10 mM Hepes (Gibco) and 10% AB Human Serum or MCSF at approximately 50 ng/ml.
Monocytes, macrophages and dendritic cells were stimulated for 6 and 12-14 hours with LPS
at 100 ng/ml. Dendritic cells were also stimulated with anti-CD40 monoclonal antibody
(Pharmingen) at 10 µg/ml for 6 and 12-14 hours.

CD4 lymphocytes, CD8 lymphocytes and NK cells were also isolated from
mononuclear cells using CD4, CD8 and CD56 Miltenyi beads, positive VS selection columns
and a Vario Magnet as per the manufacturer's instructions. CD45RA and CD45RO CD4
lymphocytes were isolated by depleting mononuclear cells of CD8, CD56, CD14 and CD19
cells using CD8, CD56, CD14 and CD19 Miltenyi beads and +ve selection. Then CD45RO
beads were used to isolate the CD45RO CD4 lymphocytes with the remaining cells being
CD45RA CD4 lymphocytes. CD45RA CD4, CD45RO CD4 and CD8 lymphocytes were
placed in DMEM 5% FCS (Hyclone), 100 µM non essential amino acids (Gibco), 1 mM
sodium pyruvate (Gibco), mercaptoethanol 5.5 x 10^-5 M (Gibco), and 10 mM Hepes (Gibco)
and plated at 10^ cells/ml onto Falcon 6 well tissue culture plates that had been coated
overnight with 0.5 µg/ml anti-CD28 (Pharmingen) and 3 µg/ml anti-CD3 (OKT3, ATCC) in
PBS. After 6 and 24 hours, the cells were harvested for RNA preparation. To prepare
chronically activated CD8 lymphocytes, we activated the isolated CD8 lymphocytes for 4
days on anti-CD28 and anti-CD3 coated plates and then harvested the cells and expanded
them in DMEM 5% FCS (Hyclone), 100 µM non essential amino acids (Gibco), 1 mM
sodium pyruvate (Gibco), mercaptoethanol 5.5 x 10^-5 M (Gibco), and 10 mM Hepes (Gibco)
and IL-2. The expanded CD8 cells were then activated again with plate bound anti-CD3 and
anti-CD28 for 4 days and expanded as before. RNA was isolated 6 and 24 hours after the
second activation and after 4 days of the second expansion culture. The isolated NK cells
were cultured in DMEM 5% FCS (Hyclone), 100 µM non essential amino acids (Gibco), 1
mM sodium pyruvate (Gibco), mercaptoethanol 5.5 x 10^-5 M (Gibco), and 10 mM Hepes
(Gibco) and IL-2 for 4-6 days before RNA was prepared.

To obtain B cells, tonsils were procured from NDRI. The tonsil was cut up with
sterile dissecting scissors and then passed through a sieve. Tonsil cells were then spun down
and resuspended at 10^ cells/ml in DMEM 5% FCS (Hyclone), 100 µM non essential amino
acids (Gibco), 1 mM sodium pyruvate (Gibco), mercaptoethanol 5.5 x 10^-5 M (Gibco), and 10
mM Hepes (Gibco). To activate the cells, we used PWM at 5 µg/ml or anti-CD40
(Pharmingen) at approximately 10 µg/ml and IL-4 at 5-10 ng/ml. Cells were harvested for
RNA preparation at 24, 48 and 72 hours.

To prepare the primary and secondary Th1/Th2 and Tr1 cells, six-well Falcon plates
were coated overnight with 10 µg/ml anti-CD28 (Pharmingen) and 2 µg/ml OKT3 (ATCC),
and then washed twice with PBS. Umbilical cord blood CD4 lymphocytes (Poietic Systems,
Germantown, MD) were cultured at 10^5-10^6 cells/ml in DMEM 5% FCS (Hyclone), 100
µM non essential amino acids (Gibco), 1 mM sodium pyruvate (Gibco), mercaptoethanol 5.5
x 10^-5 M (Gibco), 10 mM Hepes (Gibco) and IL-2 (4 ng/ml). IL-12 (5 ng/ml) and anti-IL4 (1
µg/ml) were used to direct to Th1, while IL-4 (5 ng/ml) and anti-IFN gamma (1 µg/ml) were
used to direct to Th2 and IL-10 at 5 ng/ml was used to direct to Tr1. After 4-5 days, the
activated Th1, Th2 and Tr1 lymphocytes were washed once in DMEM and expanded for 4-7
days in DMEM 5% FCS (Hyclone), 100 µM non essential amino acids (Gibco), 1 mM
sodium pyruvate (Gibco), mercaptoethanol 5.5 x 10^-5 M (Gibco), 10 mM Hepes (Gibco) and
IL-2 (1 ng/ml). Following this, the activated Th1, Th2 and Tr1 lymphocytes were
re-stimulated for 5 days with anti-CD28/OKT3 and cytokines as described above, but with the
addition of anti-CD95L (1 µg/ml) to prevent apoptosis. After 4-5 days, the Th1, Th2 and Tr1
lymphocytes were washed and then expanded again with IL-2 for 4-7 days. Activated Th1
and Th2 lymphocytes were maintained in this way for a maximum of three cycles. RNA was
prepared from primary and secondary Th1, Th2 and Tr1 after 6 and 24 hours following the
second and third activations with plate bound anti-CD3 and anti-CD28 mAbs and 4 days into
the second and third expansion cultures in Interleukin 2.

The following leukocyte cell lines were obtained from the ATCC: Ramos, EOL-1,
KU-812. EOL cells were further differentiated by culture in 0.1 mM dbcAMP at 5 x 10^
cells/ml for 8 days, changing the media every 3 days and adjusting the cell concentration to 5
x 10^ cells/ml. For the culture of these cells, we used DMEM or RPMI (as recommended by
the ATCC), with the addition of 5% FCS (Hyclone), 100 µM non essential amino acids
(Gibco), 1 mM sodium pyruvate (Gibco), mercaptoethanol 5.5 x 10^-5 M (Gibco), 10 mM
Hepes (Gibco). RNA was either prepared from resting cells or cells activated with PMA at
10 ng/ml and ionomycin at 1 µg/ml for 6 and 14 hours. We also obtained a keratinocyte line
CCD1106 and an airway epithelial tumor line NCI-H292 from the ATCC. Both were cultured
in DMEM 5% FCS (Hyclone), 100 µM non essential amino acids (Gibco), 1 mM sodium
pyruvate (Gibco), mercaptoethanol 5.5 x 10^-5 M (Gibco), and 10 mM Hepes (Gibco).
CCD1106 cells were activated for 6 and 14 hours with approximately 5 ng/ml TNF alpha and
1 ng/ml IL-1 beta, while NCI-H292 cells were activated for 6 and 14 hours with the
following cytokines: 5 ng/ml IL-4, 5 ng/ml IL-9, 5 ng/ml IL-13 and 25 ng/ml IFN gamma.

For these cell lines and blood cells, we prepared RNA by lysing approximately 10^
cells/ml using Trizol (Gibco BRL). Briefly, 1/10 volume of Bromochloropropane (Molecular
Research Corporation) was added to the RNA sample, vortexed and after 10 minutes at room
temperature, the tubes were spun at 14,000 rpm in a Sorvall SS34 rotor. The aqueous phase
was removed and placed in a 15 ml Falcon Tube. An equal volume of isopropanol was added
and left at -20 degrees C overnight. The precipitated RNA was spun down at 9,000 rpm for
15 min in a Sorvall SS34 rotor and washed in 70% ethanol. The pellet was redissolved in 300
µl of RNAse-free water and 35 µl buffer (Promega), 5 µl DTT, 7 µl RNAsin and 8 µl DNAse
were added. The tube was incubated at 37 degrees C for 30 minutes to remove contaminating
genomic DNA, extracted once with phenol chloroform and re-precipitated with 1/10 volume
of 3 M sodium acetate and 2 volumes of 100% ethanol. The RNA was spun down and placed
in RNAse free water. RNA was stored at -80 degrees C.

RTQ-PCR AI_comprehensive panel_v1.0

The plates for AI_comprehensive panel_v1.0 include two control wells and 89 test
samples comprised of cDNA isolated from surgical and postmortem human tissues obtained
from the Backus Hospital and Clinomics (Frederick, MD). Total RNA was extracted from
tissue samples from the Backus Hospital in the Facility at CuraGen. Total RNA from other
tissues was obtained from Clinomics.

Joint tissues including synovial fluid, synovium, bone and cartilage were obtained 
from patients undergoing total knee or hip replacement surgery at the Backus Hospital. 
Tissue samples were immediately snap frozen in liquid nitrogen to ensure that isolated RNA 
was of optimal quality and not degraded. Additional samples of osteoarthritis and rheumatoid 
arthritis joint tissues were obtained from Clinomics. Normal control tissues were supplied by 
Clinomics and were obtained during autopsy of trauma victims. 

Surgical specimens of psoriatic tissues and adjacent matched tissues were provided as 
total RNA by Clinomics. Two male and two female patients were selected between the ages 
of 25 and 47. None of the patients were taking prescription drugs at the time samples were 
isolated. 

Surgical specimens of diseased colon from patients with ulcerative colitis and Crohn's
disease and adjacent matched tissues were obtained from Clinomics. Bowel tissue from three
female and three male Crohn's patients between the ages of 41-69 was used. Two patients
were not on prescription medication while the others were taking dexamethasone,
phenobarbital, or tylenol. Ulcerative colitis tissue was from three male and four female
patients. Four of the patients were taking lebvid and two were on phenobarbital.

Total RNA from post mortem lung tissue from trauma victims with no disease or with
emphysema, asthma or COPD was purchased from Clinomics. Emphysema patients ranged in
age from 40-70 and all were smokers; this age range was chosen to focus on patients with
cigarette-linked emphysema and to avoid those patients with alpha-1 anti-trypsin deficiencies.
Asthma patients ranged in age from 36-75, and excluded smokers to avoid those patients
who could also have COPD. COPD patients ranged in age from 35-80 and included both
smokers and non-smokers. Most patients were taking corticosteroids and bronchodilators.

In the labels employed to identify tissues in the AI_comprehensive panel_v1.0 panel,
the following abbreviations are used: 

AI = Autoimmunity 

Syn = Synovial 

Normal = No apparent disease 

Rep22 /Rep20 = individual patients

RA = Rheumatoid arthritis

Backus = From Backus Hospital 

OA = Osteoarthritis 

(SS) (BA) (MF) = Individual patients 

Adj = Adjacent tissue 

Match control = adjacent tissues 

-M = Male 

-F = Female

COPD = Chronic obstructive pulmonary disease 



The results are shown in Tables 4-8 
Table 4. TaqMan data for Panel 13 



Tissue Name 


Rel. 

Expr. 

% 


Tissue Name 


Rel. 
Expr. % 


Liver adenocarcinoma


20.2


Renal 786-0


0.1


Heart (fetal) 


10.0 


Renal A498 


100.0 


Pancreas 


4.1 


Renal RXF 393 


4.8 


Pancreatic ca. CAPAN 2 


3.4 


Renal ACHN 


3.5 


Adrenal gland 


11.2 


Renal UO-31 


2.0 


Thyroid 


13.9 


Renal TK-10


3.3 


Salivary gland 


2.9 


Liver 


8.7 


Pituitary gland 


2.7 


Liver (fetal) 


12.0 


Brain (fetal) 


4.4 


Liver (hepatoblast) HepG2 


5.7 


Brain (whole) 


11.1 


Lung 


18.7 


Brain (amygdala) 


7.3 


Lung (fetal) 


4.4 


Brain (cerebellum) 


0.9 


Lung (small cell) LX-1 


0.6 


Brain (hippocampus) 


21.0 


Lung (small cell) NCI-H69 


0.6 


Brain (thalamus) 


8.0 


Lung (s-cell var.) SHP-77 


1.0 


Cerebral Cortex 


22.9 


Lung (large cell) NCI-H460


1.8 


Spinal cord 


17.1 


Lung (non-sm. cell) A549 


3.1 


Glio/astro U87-MG


2.7 


Lung (non-s.cell) NCI-H23 


6.3 


Glio/astro U-118-MG


38.2 


Lung (non-s.cell) HOP-62 


23.7 


astro SW 1783 


20.2 


Lung (non-s.cl) NCI-H522 


10.8 


neuro; met SK-N-AS 


10.7 


Lung (squam.) SW 900 


1.2 


astro SF-539 


0.3 


Lung (squam.) NCI-H596 


0.0 


astro SNB-75 


15.7 


Mammary gland 


35.1 


Glio SNB-19


0.0 


Breast (pl.ef) MCF-7


2.6 


Glio U251 


0.1 


Breast (pl.ef) MDA-MB-231 


6.3 


Glio SF-295 


4.3 


Breast (pl.ef) T47D 


8.0 


Heart 


2.9 


Breast BT-549 


40.6 


Skeletal muscle 


2.2 


Breast MDA-N 


0.8


Bone marrow


6.7 


Ovary 


14.1 


Thymus 


3.7 


Ovarian OVCAR-3 


0.4 


Spleen 


4.9 


Ovarian OVCAR-4 


1.3 



Lymph node 


6.4 


Ovarian OVCAR-5 


6.2 


Colorectal 


3.9 


Ovarian OVCAR-8 


0.4 


Stomach 


5.7 


Ovarian IGROV-1 


0.0 


Small intestine 


5.3 


Ovarian (ascites) SK-OV-3 


3.5 


Colon SW480 


1.3 


Uterus 


4.5 


Colon SW620(SW480 met) 


0.2 


Placenta 


95.9 


Colon HT29 


0.6 


Prostate 


9.3 


Colon HCT-1 16 


2.6 


Prostate (bone met)PC-3 


2.7 


Colon CaCo-2 


0.8 


Testis 


2.9 


Colon Ca tissue(OD03866) 


23.7 


Melanoma Hs688(A).T 


4.4 


Colon HCC-2998 


3.9 


Melanoma (met) Hs688(B).T 


27.7 


Gastricdiver met) NCI-N87 


6.6 


Melanoma UACC-62 


0.2 


Bladder 


6.0 


Melanoma M14 


0.0 


Trachea 


6.1 


Melanoma LOXIMVI 


1.2 


Kidney 


0.4 


Melanoma (met) SK-MEL-5 


0.0 


Kidney (fetal) 


22.1 


Adipose 


59.9 



Table 5. TaqMan data for Panel 2.

Tissue Name | Rel. Expr. % | Tissue Name | Rel. Expr. %
Normal Colon | 1.0 | RCC 7 Margin | 0.1
CCa 1 | 0.9 | RCC 8 | 0.1
CCa 1 Margin | 0.6 | RCC 8 Margin | 0.9
CCa 2 | 0.5 | RCC 9 | 24.7
CCa 2 Margin | 0.5 | RCC 9 Margin | 1.1
CCa 3 | 0.1 | Normal Uterus | 0.1
CCa 3 Margin | 0.3 | UtCa 1 | 0.5
CCa 4 | 0.2 | Normal Thyroid | 1.1
CCa 4 Margin | 0.2 | ThyCa 1 | 1.5
CCa 5 Metastasis | 1.6 | ThyCa 2 | 0.2
CCa 5 Margin (Liver) | 1.9 | ThyCa 2 Margin | 0.5
CCa 6 | 0.5 | Normal Breast | 0.9
CCa 6 Margin (Lung) | 0.2 | BrCa 1 | 0.2
Normal Prostate | 1.1 | BrCa 2 | 0.4
PCa 1 | 0.2 | BrCa 3 Metastasis | 2.0
PCa 1 Margin | 0.5 | BrCa 4 Metastasis | 0.5
PCa 2 | 0.3 | BrCa 5 | 0.2
PCa 2 Margin | 0.4 | BrCa 6 | 0.5
Normal Lung | 0.3 | BrCa 6 Margin | 0.5
LCa 1 Metastasis | 1.6 | BrCa 7 | 0.4
LCa 1 Margin (Muscle) | 4.2 | BrCa 7 Margin | 0.3
LCa 2 | 0.4 | Normal Liver | 1.8
LCa 2 Margin | 0.3 | HCC 1 | 4.2
LCa 3 | 2.5 | HCC 2 | 3.6
LCa 3 Margin | 1.7 | HCC 3 | 2.0
LCa 4 | 0.4 | HCC 4 | 7.1
LCa 5 | 0.6 | HCC 4 Margin | 0.9
LCa 5 Margin | 4.6 | HCC 5 | 2.8
Ocular Melanoma Metastasis | 0.0 | HCC 5 Margin | 1.4
Ocular Melanoma Margin (Liver) | 2.5 | Normal Bladder | 2.3
Melanoma Metastasis | 0.6 | TCC 1 | 0.7
Melanoma Margin (Lung) | 1.6 | TCC 2 | 0.4
Normal Kidney | 0.1 | TCC 3 | 2.9
RCC 1 | 0.5 | TCC 3 Margin | 2.7
RCC 1 Margin | 0.9 | Normal Ovary | 0.4
RCC 2 | 12 | OvCa 1 | 0.3
RCC 2 Margin | 1.3 | OvCa 2 | 4.1
RCC 3 | 100.0 | OvCa 2 Margin | 3.1
RCC 3 Margin | 1.0 | Normal Stomach | 0.3
RCC 4 | 8.1 | Normal Stomach | 0.1
RCC 4 Margin | 0.6 | GaCa 1 | 0.2
RCC 5 | 32.3 | GaCa 1 Margin | 0.3
RCC 5 Margin | 0.2 | GaCa 2 | 0.2
RCC 6 | 0.2 | GaCa 2 Margin | 0.2
RCC 6 Margin | 0.2 | GaCa 3 | 0.4
RCC 7 | 0.2 | RCC 7 Margin | 0.1



Table 6. TaqMan data for Panel 3.

Tissue Name | Rel. Expr. % | Tissue Name | Rel. Expr. %
94905_Daoy_Medulloblastoma/Cerebellum_sscDNA | 0.1 | 94954_Ca Ski_Cervical epidermoid carcinoma (metastasis)_sscDNA | 5.5
94906_TE671_Medulloblastoma/Cerebellum_sscDNA | 3.2 | 94955_ES-2_Ovarian clear cell carcinoma_sscDNA | 1.5
94907_D283 Med_Medulloblastoma/Cerebellum_sscDNA | 0.4 | 94957_Ramos/6h stim_Stimulated with PMA/ionomycin 6h_sscDNA | 0.0
94908_PFSK-1_Primitive Neuroectodermal/Cerebellum_sscDNA | 3.2 | 94958_Ramos/14h stim_Stimulated with PMA/ionomycin 14h_sscDNA | 0.0
94909_XF-498_CNS_sscDNA | 3.1 | 94962_MEG-01_Chronic myelogenous leukemia (megakaryoblast)_sscDNA | 0.9
94910_SNB-78_CNS/glioma_sscDNA | 7.6 | 94963_Raji_Burkitt's lymphoma_sscDNA | 0.2
94911_SF-268_CNS/glioblastoma_sscDNA | 2.9 | 94964_Daudi_Burkitt's lymphoma_sscDNA | 0.3
94912_T98G_Glioblastoma_sscDNA | 0.2 | 94965_U266_B-cell plasmacytoma/myeloma_sscDNA | 1.1
96776_SK-N-SH_Neuroblastoma | 11.7 | 94968_CA46_Burkitt's lymphoma_sscDNA | 0.0
94913_SF-295_CNS/glioblastoma_sscDNA | 1.4 | 94970_RL_non-Hodgkin's B-cell lymphoma_sscDNA | 0.0
94914_Cerebellum_sscDNA | 5.5 | 94972_JM1_pre-B-cell lymphoma/leukemia_sscDNA | 0.2
[illegible]_Cerebellum_sscDNA | 3.4 | 94973_Jurkat_T cell leukemia_sscDNA | 1.2
94916_NCI-H292_Mucoepidermoid lung | 8.2 | 94974_TF-1_Erythroleukemia_sscDNA | 0.3
94917_DMS-114_Small cell lung cancer_sscDNA | 1.1 | 94975_HUT 78_T-cell lymphoma_sscDNA | 0.6
94918_DMS-79_Small cell lung cancer/neuroendocrine_sscDNA | 8.5 | 94977_U937_Histiocytic lymphoma_sscDNA | 0.4
94919_NCI-H146_Small cell lung cancer/neuroendocrine_sscDNA | 0.6 | 94980_KU-[illegible] leukemia_sscDNA | 0.5
[illegible]_Small cell lung cancer/neuroendocrine_sscDNA | [illegible] | 94981_769-P_Clear cell renal carcinoma_sscDNA | 3.1
94921_NCI-N417_Small cell lung cancer/neuroendocrine_sscDNA | 0.0 | 94983_Caki-2_Clear cell | 35.9
94923_NCI-H82_Small cell lung cancer/neuroendocrine_sscDNA | 0.3 | 94984_SW 839_Clear cell renal carcinoma_sscDNA | 24.5
94924_NCI-H157_Squamous cell lung cancer (metastasis)_sscDNA | 0.2 | 94986_G401_Wilms' | 0.9
94925_NCI-H1155_Large cell lung cancer/neuroendocrine_sscDNA | 0.5 | 94987_Hs766T_Pancreatic carcinoma (LN | 18.3
94926_NCI-H1299_Large cell lung cancer/neuroendocrine_sscDNA | 37.9 | 94988_CAPAN-1_Pancreatic adenocarcinoma (liver metastasis)_sscDNA | 2.6
94927_NCI-H727_Lung carcinoid_sscDNA | 0.7 | 94989_SU86.86_Pancreatic carcinoma (liver metastasis)_sscDNA | 0.2
94928_NCI-UMC-11_Lung carcinoid_sscDNA | 0.7 | 94990_BxPC-3_Pancreatic adenocarcinoma_sscDNA | 5.8
94929_LX-1_Small cell lung cancer_sscDNA | 0.0 | 94991_HPAC_Pancreatic adenocarcinoma_sscDNA | 1.4
94930_Colo-205_Colon cancer_sscDNA | 0.8 | 94992_MIA PaCa-2_Pancreatic carcinoma_sscDNA | 2.5
94931_KM12_Colon cancer_sscDNA | 1.0 | 94993_CFPAC-1_Pancreatic ductal adenocarcinoma_sscDNA | 3.0
[illegible] | 0.0 | 94994_PANC-1_Pancreatic epithelioid ductal carcinoma_sscDNA | 100.0
94933_NCI-H716_Colon cancer_sscDNA | 6.1 | 94996_T24_Bladder carcinoma (transitional cell)_sscDNA | 49.0
94935_SW-48_Colon | 0.3 | 94997_5637_Bladder carcinoma_sscDNA | 0.5
94936_SW1116_Colon adenocarcinoma_sscDNA | 0.4 | 94998_HT-1197_Bladder carcinoma_sscDNA | 5.2
[illegible] adenocarcinoma_sscDNA | 0.2 | 94999_UM-UC-3_Bladder carcinoma (transitional cell)_sscDNA | 62.0
94938_SW-948_Colon adenocarcinoma_sscDNA | 0.0 | 95000_A204_Rhabdomyosarcoma_sscDNA | 0.6
94939_SW-480_Colon adenocarcinoma_sscDNA | 0.5 | 95001_HT-1080_Fibrosarcoma_sscDNA | 0.1
94940_NCI-SNU-5_Gastric carcinoma_sscDNA | 1.5 | 95002_MG-[illegible]_Osteosarcoma (bone)_sscDNA | 18.7
94941_KATO III_Gastric carcinoma_sscDNA | [illegible] | [illegible]-1_Leiomyosarcoma (vulva)_sscDNA | 9.3
94943_NCI-SNU-16_Gastric carcinoma_sscDNA | 97.3 | 95004_SJRH30_Rhabdomyosarcoma (met to bone marrow)_sscDNA | 0.4
94944_NCI-SNU-1_Gastric carcinoma_sscDNA | 1.4 | 95005_A431_Epidermoid carcinoma_sscDNA | 0.4
9494[illegible]_RF-1_Gastric adenocarcinoma_sscDNA | 0.0 | 95007_WM266-4_Melanoma_sscDNA | 18.2
94947_RF-48_Gastric adenocarcinoma_sscDNA | 0.0 | 95010_DU 145_Prostate carcinoma (brain metastasis)_sscDNA | 0.3
96778_MKN-45_Gastric carcinoma_sscDNA | 3.4 | 95012_MDA-MB-468_Breast adenocarcinoma_sscDNA | 0.0
94949_NCI-N87_Gastric carcinoma_sscDNA | 0.3 | 95013_SCC-4_Squamous cell carcinoma of tongue_sscDNA | 0.0
94951_OVCAR-5_Ovarian carcinoma_sscDNA | 2.0 | 95014_SCC-9_Squamous cell carcinoma of tongue_sscDNA | 0.7
94952_RL95-2_Uterine carcinoma_sscDNA | 1.7 | 95015_SCC-15_Squamous cell carcinoma of tongue_sscDNA | 0.0
94953_HelaS3_Cervical adenocarcinoma_sscDNA | 1.2 | 95017_CAL 27_Squamous cell carcinoma of tongue_sscDNA | 0.0



Table 7. TaqMan data for Panel 4.

Tissue Name | Rel. Expr. % | Tissue Name | Rel. Expr. %
Secondary Th1 act | 0.2 | HUVEC IL-1 beta | [illegible]
Secondary Th2 act | 0.3 | HUVEC IFN gamma | 0.8
Secondary Tr1 act | 0.6 | HUVEC TNF alpha + IFN gamma | [illegible]
Secondary Th1 rest | 0.1 | HUVEC TNF alpha + IL4 | 3.0
Secondary Th2 rest | 0.0 | HUVEC IL-11 | 1.2
Secondary Tr1 rest | 0.1 | Lung Microvascular EC none | [illegible]
Primary Th1 act | 0.0 | Lung Microvascular EC TNFalpha + IL-1beta | [illegible]
Primary Th2 act | 0.0 | Microvascular Dermal EC none | [illegible]
Primary Tr1 act | 0.1 | Microvascular Dermal EC TNFalpha + IL-1beta | 9.9
Primary Th1 rest | 0.1 | Bronchial epithelium TNFalpha + IL1beta | 3.7
Primary Th2 rest | 0.1 | Small airway epithelium none | [illegible]
Primary Tr1 rest | 0.1 | Small airway epithelium TNFalpha + IL-1beta | 100.0
CD45RA CD4 lymphocyte act | 0.1 | Coronary artery SMC rest | 6.8
CD45RO CD4 lymphocyte act | 0.1 | Coronary artery SMC TNFalpha + IL-1beta | 2.1
CD8 lymphocyte act | 0.1 | Astrocytes rest | 5.3
Secondary CD8 lymphocyte rest | 0.3 | Astrocytes TNFalpha + IL-1beta | 8.1
Secondary CD8 lymphocyte act | 0.2 | KU-812 (Basophil) rest | 0.1
CD4 lymphocyte none | 0.0 | KU-812 (Basophil) PMA/ionomycin | [illegible]
2ry Th1/Th2/Tr1 anti-CD95 CH11 | 0.1 | CCD1106 (Keratinocytes) none | [illegible]
LAK cells rest | 0.1 | CCD1106 (Keratinocytes) TNFalpha + IL-1beta | 2.1
LAK cells IL-2 | 0.5 | Liver cirrhosis | [illegible]
LAK cells IL-2+IL-12 | 0.1 | Lupus kidney | 0.1
LAK cells IL-2+IFN gamma | 0.1 | NCI-H292 none | 0.2
LAK cells IL-2+ IL-18 | 0.2 | NCI-H292 IL-4 | 1.0
LAK cells PMA/ionomycin | 35.9 | NCI-H292 IL-9 | [illegible]
NK Cells IL-2 rest | 0.1 | NCI-H292 IL-13 | [illegible]
Two Way MLR 3 day | 0.2 | NCI-H292 IFN gamma | 0.4
Two Way MLR 5 day | 0.1 | HPAEC none | [illegible]
Two Way MLR 7 day | 0.1 | HPAEC TNF alpha + IL-1 beta | [illegible]
PBMC rest | 0.1 | Lung fibroblast none | [illegible]
PBMC PWM | 0.4 | Lung fibroblast TNF alpha + IL-1 beta | 0.7
PBMC PHA-L | 1.9 | Lung fibroblast IL-4 | 14.8
Ramos (B cell) none | 0.0 | Lung fibroblast IL-9 | 4.1
Ramos (B cell) ionomycin | 0.1 | Lung fibroblast IL-13 | 7.0
B lymphocytes PWM | 0.7 | Lung fibroblast IFN gamma | 13.0
B lymphocytes CD40L and IL-4 | 0.3 | Dermal fibroblast CCD1070 rest | 2.2
EOL-1 dbcAMP | 0.3 | Dermal fibroblast CCD1070 TNF alpha | 1.1
EOL-1 dbcAMP PMA/ionomycin | 0.3 | Dermal fibroblast CCD1070 IL-1 beta | 0.8
Dendritic cells none | 0.4 | Dermal fibroblast IFN gamma | 1.1
Dendritic cells LPS | 0.8 | Dermal fibroblast IL-4 | 7.9
Dendritic cells anti-CD40 | 0.3 | IBD Colitis 1 | 0.2
Monocytes rest | 0.2 | IBD Colitis 2 | 0.8
Monocytes LPS | 0.1 | IBD Crohn's | 2.4
Macrophages rest | 0.2 | Colon | 3.8
Macrophages LPS | 0.5 | Lung | 4.6
HUVEC none | 1.6 | Thymus | 0.8
HUVEC starved | 1.0 | Kidney | 3.6



Table 8. RTQ-PCR for Panel A/I

Tissue Name | Rel. Expr., % tm8144t a^012 bl
110967 COPD-F | 3.2
110980 COPD-F | 1.7
110968 COPD-M | 3.6
110977 COPD-M | 3.9
110989 Emphysema-F | 2.7
110992 Emphysema-F | 1.2
110993 Emphysema-F | 2.1
110994 Emphysema-F | 1.7
110995 Emphysema-F | 1.9
110996 Emphysema-F | 0.3
110997 Asthma-M | 0.8
111001 Asthma-F | 0.7
111002 Asthma-F | 1.3
111003 Atopic Asthma-F | 2.5
111004 Atopic Asthma-F | 2.5
111005 Atopic Asthma-F | 1.4
111006 Atopic Asthma-F | 0.5
111417 Allergy-M | 0.9
112347 Allergy-M | 0.1
112349 Normal Lung-F | 0.0
112357 Normal Lung-F | 5.5
112354 Normal Lung-M | 1.1
112374 Crohns-F | 0.9
112389 Match Control Crohns-F | 1.9
112375 Crohns-F | 1.0
112732 Match Control Crohns-F | 2.4
112725 Crohns-M | 0.1
112387 Match Control Crohns-M | 0.7
112378 Crohns-M | 0.1
112[illegible]90 Match Control Crohns-M | 1.9
112726 Crohns-M | 3.2
112731 Match Control Crohns-M | 1.1
112380 Ulcer Col-F | 3.4
112734 Match Control Ulcer Col-F | 4.6
112384 Ulcer Col-F | 4.3
112737 Match Control Ulcer Col-F | 1.4
112386 Ulcer Col-F | 1.2
112738 Match Control Ulcer Col-F | 2.2
112381 Ulcer Col-M | 0.2
[illegible] Match Control Ulcer Col-M | 0.4
112382 Ulcer Col-M | 2.2
112394 Match Control Ulcer Col-M | 0.3
112383 Ulcer Col-M | 1.5
1127[illegible]6 Match Control Ulcer Col-M | 1.7
[illegible] | 1.8
112477 Match Control Psoriasis-F | 3.9
[illegible] | 3.2
[illegible] Match Control Psoriasis-M | 5.1
[illegible] Psoriasis-M | 3.1
[illegible] Match Control Psoriasis-M | [illegible]
[illegible] Psoriasis-M | 7.2
[illegible] Match Control Psoriasis-M | 2.2
[illegible] | 58.2
[illegible] (MF) Adj "Normal" Bone-Backus | 71.9
[illegible] | 90.[illegible]
[illegible] | 81.0
[illegible] | 13.5
[illegible] | 48.6
[illegible] OA Synovium-Backus | 37.4
104700 (SS) OA Bone-Backus | 36.8
104701 (SS) Adj "Normal" Bone-Backus | 42.1
104702 (SS) OA Synovium-Backus | 100.0
117093 OA Cartilage Rep7 | 4.2
112672 OA Bone5 | 4.5
112673 OA Synovium5 | 1.6
112674 OA Synovial Fluid Cells5 | 2.0
117100 OA Cartilage Rep14 | 1.9
112756 OA Bone9 | 1.8
112757 OA Synovium9 | 4.5
112758 OA Synovial Fluid Cells9 | 1.7
117125 RA Cartilage Rep2 | 13.6
113492 Bone2 RA | 3.0
113493 Synovium2 RA | 1.0
113494 Syn Fluid Cells RA | 2.8
113499 Cartilage4 RA | 2.4
113500 Bone4 RA | 2.6
113501 Synovium4 RA | 1.8
113502 Syn Fluid Cells4 RA | 1.5
113495 Cartilage3 RA | 1.8
113496 Bone3 RA | 1.7
113497 Synovium3 RA | 0.8
113498 Syn Fluid Cells3 RA | 3.0
117106 Normal Cartilage Rep20 | 7.3
113663 Bone3 Normal | 0.2
113664 Synovium3 Normal | 0.0
113665 Syn Fluid Cells3 Normal | 0.0
117107 Normal Cartilage Rep22 | 1.5
113667 Bone4 Normal | 0.8
113668 Synovium4 Normal | 0.8
113669 Syn Fluid Cells4 Normal | 1.7



ARP is overexpressed in 3/5 clear cell renal cell carcinomas, 0/2 papillary renal cell
carcinomas and 0/2 uncharacterized renal cell carcinomas (Panel 2D). Furthermore, ARP is
expressed in fetal kidney and in renal cell carcinoma-derived cell lines but not in adult kidney
(Panel 1.3D), an indication of an oncofetal expression pattern often associated with genes
involved in kidney development, organogenesis and kidney tumorigenesis.
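
As a rough illustration of the oncofetal pattern described above, the following sketch expresses a subset of the Panel 1.3 values from Table 4 as multiples of the adult kidney value. The variable and function names are illustrative assumptions and are not part of the original analysis; only the numeric values are taken from the table.

```python
# Relative expression (%) copied from Table 4 (Panel 1.3); a subset chosen for illustration.
ADULT_KIDNEY = 0.4

samples = {
    "Kidney (fetal)": 22.1,
    "Renal A498": 100.0,
    "Renal RXF 393": 4.8,
    "Renal ACHN": 3.5,
    "Renal TK-10": 3.3,
    "Renal UO-31": 2.0,
}


def ratio_to_reference(value: float, reference: float) -> float:
    """Return how many times larger `value` is than `reference`."""
    if reference <= 0:
        raise ValueError("reference expression must be positive")
    return value / reference


# Ratio of each sample to adult kidney (hypothetical helper output, not a published statistic).
ratios = {name: ratio_to_reference(v, ADULT_KIDNEY) for name, v in samples.items()}

for name, r in sorted(ratios.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {r:.1f}x adult kidney")
```

Every sample in this subset lies above the adult kidney value, which is the sense in which the text calls the pattern oncofetal; the ratios are descriptive only and carry no statistical weight.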

Data from Panel 4D indicate that upon immune stimulation of airway epithelial
cells and lung fibroblasts, ARP is expressed at increased levels. Specifically, expression
of ARP in small airway epithelial cells treated with TNF alpha and IL-1 beta is
up-regulated ca. 5.4 fold relative to untreated cells. In addition, expression in normal human
lung fibroblast cells treated with IL-4, IL-9, IL-13 and interferon gamma is up-regulated
7.4, 2, 3.5 and 6.5 fold, respectively, compared to that in resting cells. Finally, expression of
ARP in LAK cells treated with PMA/ionomycin is up-regulated over 350 fold relative to the
expression in resting cells. These data indicate that ARP plays a role in inflammation related
to the above cells of the pulmonary system and is thereby implicated as a target for
therapeutic intervention by protein and antibody therapeutics as well as small molecule
pharmaceuticals. A wholly human antibody directed at ARP, for example, may diminish the
symptoms of patients with allergy, asthma or emphysema.
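
The fold changes quoted above are simple ratios of treated to resting relative expression. The sketch below, a minimal illustration rather than the authors' own calculation, recomputes the LAK cell fold change from the two Table 7 values and back-calculates the resting lung fibroblast value implied by each stated fold change. All names are hypothetical; the numbers come from Table 7 and the paragraph above.

```python
def fold_change(treated: float, resting: float) -> float:
    """Ratio of treated to resting relative expression."""
    if resting <= 0:
        raise ValueError("resting expression must be positive")
    return treated / resting


# LAK cells: both values are listed in Table 7 (Panel 4).
lak_rest = 0.1
lak_pma_ionomycin = 35.9
lak_fold = fold_change(lak_pma_ionomycin, lak_rest)
print(f"LAK cells, PMA/ionomycin vs rest: about {lak_fold:.0f} fold")

# Lung fibroblasts: (treated value from Table 7, fold change stated in the text).
lung_fibroblast = {
    "IL-4": (14.8, 7.4),
    "IL-9": (4.1, 2.0),
    "IL-13": (7.0, 3.5),
    "IFN gamma": (13.0, 6.5),
}

# Resting value implied by each treated/fold pair; they should roughly agree.
implied_resting = {
    treatment: treated / fold
    for treatment, (treated, fold) in lung_fibroblast.items()
}
for treatment, value in implied_resting.items():
    print(f"{treatment}: implied resting expression {value:.2f}%")
```

The four implied resting values cluster around 2%, so the stated fold changes are mutually consistent with a single resting lung fibroblast measurement.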

Studies have indicated that PMA induces down-regulation of LAK cell-mediated
cytotoxicity (by inactivation of protein kinase C activity in LAK cells). The exact role of
ARP in LAK cells is not known; however, based on the TaqMan data presented in the present
invention, ARP plays a role in inflammation and may also be implicated in the ability of LAK
cells to effectively destroy tumor cells. Therefore, a therapeutic antibody directed
against ARP (and thereby preventing ARP from being upregulated) may be therapeutic in
treating cancer because of the resulting increased activity of LAK cells.

Data from Panel A/I illustrate that ARP transcript is highly expressed in joint tissue
from osteoarthritic patients, but not in joint tissue from normal patients. ARP is
a target of peroxisome proliferator-activated receptor-gamma (PPARG) and may have a role
in regulation of systemic lipid metabolism or glucose homeostasis. The data presented on the
A/I panel and from studies done with PPARG are consistent with ARP also functioning in
the development and pathogenesis of osteoarthritis.

Table 9: SAGE Data

Hs.9613 : PPAR(gamma) angiopoietin related protein
SAGE library data and reliable tag summary:
Reliable tags found in SAGE libraries:

Library name | Tags per million | Tag counts | Total tags
[illegible] no-DHT | 15 | 1 | 64631
SAGE SciencePark MCF7 Control 0h | 16 | 1 | 61079
SAGE Duke precrisis fibroblasts | [illegible] | 1 | [illegible]
SAGE Duke 1273 | [illegible] | 3 | 38838
SAGE Duke thalamus | 123 | 3 | 24371
SAGE CAPAN2 | 43 | 1 | 23222
SAGE HS766T | 286 | 3 | 10467
SAGE Panc1 | 80 | 2 | 24879
SAGE HX | 279 | 9 | 32167
SAGE H126 | 215 | 7 | 32420
SAGE Duke H54 lacZ | 14 | 1 | 67101
SAGE Duke H54 EGFRvIII | 87 | 5 | 57164
SAGE Duke H292 | 17 | 1 | 67529
SAGE Duke GBM H1110 | 42 | 3 | 70061
SAGE SW837 | 18 | 1 | 60986
SAGE RKO | 57 | 3 | 52064
SAGE [illegible] prostate tumor | 15 | 1 | 65103
SAGE pooled GBM | 80 | 5 | 61841
SAGE BB542 whitematter | 84 | 8 | 94808
SAGE NHA(5th) | [illegible] | 5 | 52198
SAGE normal pool(6th) | 31 | 2 | 63064
SAGE Panc 91-16113 | 83 | 3 | 33941
SAGE Panc 96-6252 | 27 | 1 | 35745
SAGE OV1063-3 | 25 | 1 | 38938
SAGE Tu98 | 20 | 1 | 49005
SAGE SciencePark MCF7 Control 0h | 16 | 1 | 61079
SAGE Ped GBM1062 | 33 | 2 | 59935
SAGE HOSE 4 | 82 | 4 | 48413
SAGE Duke HMVEC | 152 | 8 | 52532
SAGE Duke HMVEC+VEGF | [illegible] | 9 | 57928
SAGE mammary epithelium | 81 | 4 | 49167
SAGE OVT-8 | 29 | 1 | 33575
SAGE Duke 40N | 140 | 1 | 7142
SAGE Duke 48N | 248 | 3 | 12091
SAGE Duke H247 Hypoxia | 125 | [illegible] | 71937
SAGE DCIS 2 | 173 | 5 | 28888
SAGE [illegible] | 106 | 4 | 37558
SAGE IOSE29-11 | 61 | 3 | 48496
SAGE Duke H1043 | 26 | 2 | 76673
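
The three numbers listed for each SAGE library in Table 9 appear to be a normalized tag frequency, the raw tag count, and the total number of tags sequenced in that library. The sketch below is a minimal check of that reading, under the assumption that the first number is tags per million with the fractional part dropped; this interpretation is inferred from the arithmetic and is not stated in the table. The function name is hypothetical and only clearly legible rows are used.

```python
# Rows copied from Table 9: (library, listed normalized value, tag count, total tags).
rows = [
    ("SAGE CAPAN2", 43, 1, 23222),
    ("SAGE HS766T", 286, 3, 10467),
    ("SAGE RKO", 57, 3, 52064),
    ("SAGE HOSE 4", 82, 4, 48413),
    ("SAGE Duke 40N", 140, 1, 7142),
    ("SAGE Duke 48N", 248, 3, 12091),
]


def tags_per_million(tag_count: int, total_tags: int) -> int:
    """Scale a raw tag count to a one-million-tag library, truncating the fraction."""
    if total_tags <= 0:
        raise ValueError("total_tags must be positive")
    return tag_count * 1_000_000 // total_tags


# Collect any row whose listed value differs from the recomputed one.
mismatches = []
for library, listed, count, total in rows:
    computed = tags_per_million(count, total)
    print(f"{library}: listed {listed}, computed {computed}")
    if computed != listed:
        mismatches.append(library)
```

Normalizing in this way lets libraries of very different depth be compared directly: a count of 3 tags in a library of about 10,000 tags represents a far higher frequency than 3 tags in a library of about 50,000.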
Example 6. Comparing Expression of ARP with Vascular Endothelial Growth Factor (VEGF) Expression

Paradis and coworkers assessed VEGF expression in a large series of renal tumors
with a long follow-up, correlated with the usual histo-prognostic factors and survival. Their
study revealed that in the group of clear cell RCCs, VEGF expression was positively
correlated with both nuclear grade (P=0.05) and size of the tumor (P=0.05). Furthermore, a
significant correlation was observed between VEGF expression and microvascular count
(P=0.04). Finally, cumulative survival rate was significantly lower in the group of patients
with clear cell RCCs expressing VEGF (log rank test, P=0.01). In the Cox model, VEGF
expression was a significant independent predictor of outcome, as were stage and nuclear
grade (Paradis V, Lagha NB, Zeimoura L, Blanchet P, Eschwege P, Ba N, Benoit G, Jardin
A, Bedossa P. Expression of vascular endothelial growth factor in renal cell carcinomas.
Virchows Arch 2000 Apr;436(4):351-6). The expression profile of VEGF was compared with
the expression profile of ARP. As shown in Figure 3, ARP overexpression is higher and more
specific than that of VEGF, indicating that ARP could be used as a better clinical marker and
that more efficacious and specific therapeutics can be directed at regulating ARP expression.
These results also indicate that a treatment that modulates the expression of VEGF and ARP
at the same time may achieve synergistic effects. An example of a treatment that can mitigate
the effects of the expression of both VEGF and ARP is a bispecific antibody directed against
both of these targets. The bispecific antibody contemplated to be within the scope of the
claims for this invention may be an antibody generated by quadroma technology, or by
chemical cross-linking of mono-specific antibodies (one directed against VEGF, the other
against ARP), or a bispecific single chain antibody dimer. Formulations of single chain
antibodies may include, but are not limited to: VL(a)-Linker-VH(a)-Linker-VL(b)-Linker-VH(b).
For examples of bispecific antibodies see: US Patent 6,030,792 by Otterness et al., the
references therein included here; multivalent single chain antibodies, US Patents 5,892,020
and 5,877,291 by Mezes et al.; US Patent 6,071,515: Dimer and multimer forms of single
chain polypeptides by Mezes et al.; and US Patent 6,121,424: Multivalent antigen-binding
proteins by Whitlow et al. See Figure 1.



OTHER EMBODIMENTS 
Although particular embodiments have been disclosed herein in detail, this has been done by
way of example for purposes of illustration only, and is not intended to be limiting with
respect to the scope of the appended claims, which follow. In particular, it is contemplated
by the inventors that various substitutions, alterations, and modifications may be made to the
invention without departing from the spirit and scope of the invention as defined by the
claims. The choice of nucleic acid starting material, clone of interest, or library type is
believed to be a matter of routine for a person of ordinary skill in the art with knowledge of
the embodiments described herein. Other aspects, advantages, and modifications are considered
to be within the scope of the following claims.